Abstract We show how to combine Bayes nets and game theory to predict the behavior
of hybrid systems involving both humans and automated components. We
call this novel framework "Semi Network-Form Games," and illustrate it by pre- 
dicting aircraft pilot behavior in potential near mid-air collisions. At present, at the 
beginning of such potential collisions, a collision avoidance system in the aircraft 
cockpit advises the pilots what to do to avoid the collision. However, studies of mid-air
encounters have found wide variability in pilot responses to avoidance system
advisories. In particular, pilots rarely execute the recommended maneuvers perfectly,
despite the fact that the collision avoidance system's effectiveness relies on
their doing so. Rather, pilots decide their actions based on all information available
to them (advisory, instrument readings, visual observations). We show how to build
this aspect into a semi network-form game model of the encounter and then present 
computational simulations of the resultant model. 



1 Introduction 



Bayes nets have been widely investigated and commonly used to describe stochastic
systems [1, 10, 26]. Powerful techniques already exist for the manipulation, inference,
and learning of probabilistic networks. Furthermore, these methods have
been well-established in many domains, including expert systems, robotics, speech
recognition, and networking and communications [19]. On the other hand, game
theory is frequently used to describe the behavior of interacting humans EH9). A 
vast amount of experimental literature exists (especially in economic contexts, such
as auctions and negotiations), which analyzes and refines human behavior models
fl4j[5]II3. These two fields have traditionally been regarded as orthogonal bodies 
of work. However, in this work we propose to create a modeling framework that 
leverages the strengths of both. 

Building on earlier approaches J2]|20l, we introduce a novel framework, "Semi 
Network-Form Game," (or "semi net-form game") that combines Bayes nets and 
game theory to model hybrid systems. We use the term "hybrid systems" to mean
systems that involve multiple interacting human and automation components.
The semi network-form game is a specialization of the complete framework
"network-form game," formally defined and elaborated in [32].

The issue of aircraft collision avoidance has recently received wide attention 
from aviation regulators due to some alarming near mid-air collision (NMAC) 
statistics E4ll . Many discussions call into question the effectiveness of current sys- 
tems, especially that of the onboard collision avoidance system. This system, called 
"Traffic Alert and Collision Avoidance System (TCAS)," is associated with many 
weaknesses that render it increasingly less effective as traffic density grows exponentially.
Some of these weaknesses include complex, contorted advisory logic, vertical-only
advisories, and unrealistic pilot models. In this work, we demonstrate how
the collision avoidance problem can be modeled using a semi net-form game, and 
show how this framework can be used to perform safety and performance analyses. 

The rest of this chapter is organized as follows. In Section 2, we start by establishing
the theoretical fundamentals of semi net-form games. First, we give a formal
definition of the semi net-form game. Secondly, we motivate and define a new game
theoretic equilibrium concept called "level-K relaxed strategies" that can be used
to make predictions on a semi net-form game. Motivated by computational issues,
we then present variants of this equilibrium concept that improve both computational
efficiency and prediction variance. In Section 3, we use a semi net-form game
to model the collision avoidance problem and discuss in detail the modeling of a
2-aircraft mid-air encounter. First, we specify each element of the semi net-form game
model and describe how we compute a sample of the game theoretic equilibrium.
Secondly, we describe how to extend the game across time to simulate a complete
encounter. Then we present the results of a sensitivity analysis on the model and
examine the potential benefits of a horizontal advisory system. Finally, we conclude
with a discussion of semi net-form game benefits in Section 4 and concluding remarks
in Section 5.



2 Semi Network-Form Games 



Before we formally define the semi net-form game and various player strategies, we 
first define the notation used throughout the chapter. 

2.1 Notation

Our notation is a combination of standard game theory notation and standard Bayes
net notation. The probabilistic simplex over a space $Z$ is written as $\Delta_Z$. Any Cartesian
product $\times_{y \in Y} \Delta_Z$ is written as $\Delta_{Z \mid Y}$. So $\Delta_{Z \mid Y}$ is the space of all possible
conditional distributions of $z \in Z$ conditioned on a value $y \in Y$.

We indicate the size of any finite set $A$ as $|A|$. Given a function $g$ with domain
$X$ and a subset $Y \subset X$, we write $g(Y)$ to mean the set $\{g(x) : x \in Y\}$. We couch
the discussion in terms of countable spaces, but much of the discussion carries over
to the uncountable case, e.g., by replacing Kronecker deltas $\delta_{a,b}$ with Dirac deltas
$\delta(a - b)$.

We use uppercase letters to indicate a random variable or its range, with the 
context making the choice clear. We use lowercase letters to indicate a particular 
element of the associated random variable's range, i.e., a particular value of that 
random variable. Given a particular player $i$, we will use the notation
$-i$ to denote all players excluding player $i$. We will also use primes to indicate
sampled or dummy variables. 



2.2 Definition 

A semi net-form game uses a Bayes net to serve as the underlying probabilistic 
framework, consequently representing all parts of the system using random vari- 
ables. Non-human components such as automation and physical systems are de- 
scribed using "chance" nodes, while human components are described using "de- 
cision" nodes. Formally, chance nodes differ from decision nodes in that their con- 
ditional probability distributions are pre-specified. Instead each decision node is 
associated with a utility function, which maps an instantiation of the net to a real 
number quantifying the player's utility. To fully specify the Bayes net, it is neces- 
sary to determine the conditional distributions at the decision nodes to go with the 
distributions at the chance nodes. We will discuss how to arrive at the players' conditional
distributions (over possible actions), also called their "strategies," later in
Section 2.6. We now formally define a semi network-form game as follows:



Definition 1. An ($N$-player) semi network-form game is a quintuple $(G, X, u, R, \pi)$
where

1. $G$ is a finite directed acyclic graph $\{V, E\}$, where $V$ is the set of vertices and $E$ is
the set of connecting edges of the graph. We write the set of parent nodes of any
node $v \in V$ as $pa(v)$ and its successors as $succ(v)$.

2. $X$ is a Cartesian product of $|V|$ separate finite sets, each with at least two elements,
with the set for element $v \in V$ written as $X_v$, and the Cartesian product of
sets for all elements in $pa(v)$ written as $X_{pa(v)}$.

3. $u$ is a function $X \to \mathbb{R}^N$. We will typically view it as a set of $N$ utility functions
$u_i : X \to \mathbb{R}$.

4. $R$ is a partition of $V$ into $N + 1$ subsets, the first $N$ of which have exactly one
element. The elements of $R(1)$ through $R(N)$ are called "Decision" nodes, and
the elements of $R(N + 1)$ are "Chance" nodes. We write $D = \cup_{i=1}^{N} R(i)$ and
$C = R(N + 1)$.

5. $\pi$ is a function from $v \in R(N + 1) \to \Delta_{X_v \mid \times_{v' \in pa(v)} X_{v'}}$. (In other words, $\pi$ assigns to
every $v \in R(N + 1)$ a conditional probability distribution of $v$ conditioned on the
values of its parents.)

Intuitively, $X_v$ is the set of all possible states at node $v$, $u_i$ is the utility function of
player $i$, $R(i)$ is the decision node set by player $i$, and $\pi$ is the fixed set of distributions
at chance nodes. As an example, a normal-form game [22] is a semi net-form game
in which $E$ is empty. As another example, let $v$ be a decision node of player $i$ that has
one parent, $v'$. Then the conditional distribution $P(X_{v'} \mid X_{pa(v')})$ is a generalization
of an information set.
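
As an informal illustration of Definition 1, the quintuple can be held in a small Python container. This is only a sketch: the class name `SemiNetFormGame`, its field names, and the two-node example (one chance node `B` observed by one decision node `P1`) are hypothetical and do not come from the chapter.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]  # one value per node: an element of X


@dataclass
class SemiNetFormGame:
    """Hypothetical container for the quintuple (G, X, u, R, pi)."""
    parents: Dict[str, List[str]]              # G: node -> pa(node)
    spaces: Dict[str, List[int]]               # X_v for every node v
    utilities: List[Callable[[State], float]]  # u_i : X -> R, one per player
    decision: List[str]                        # R(1), ..., R(N): one node each
    chance_cpd: Dict[str, Callable[[Tuple[int, ...]], List[float]]]  # pi

    def topological_order(self) -> List[str]:
        """Order nodes so parents come first; raises if G has a cycle."""
        order: List[str] = []
        placed = set()
        while len(order) < len(self.parents):
            ready = sorted(v for v in self.parents
                           if v not in placed
                           and all(p in placed for p in self.parents[v]))
            if not ready:
                raise ValueError("graph is not acyclic")
            order.extend(ready)
            placed.update(ready)
        return order

    def validate(self) -> None:
        """Check the structural conditions stated in Definition 1."""
        nodes = set(self.parents)
        assert set(self.spaces) == nodes
        assert all(len(xs) >= 2 for xs in self.spaces.values())
        assert len(self.utilities) == len(self.decision)
        assert len(set(self.decision)) == len(self.decision)
        # Chance nodes C = V \ D are exactly the nodes with a fixed CPD.
        assert set(self.chance_cpd) == nodes - set(self.decision)
        self.topological_order()


# One chance node B observed by one decision node P1.
game = SemiNetFormGame(
    parents={"B": [], "P1": ["B"]},
    spaces={"B": [0, 1], "P1": [0, 1]},
    utilities=[lambda x: 1.0 if x["P1"] == x["B"] else 0.0],
    decision=["P1"],
    chance_cpd={"B": lambda pa_values: [0.5, 0.5]},
)
game.validate()
print(game.topological_order())
```

Note that the decision node carries no conditional distribution here; only its utility function is given, which mirrors the distinction between chance and decision nodes drawn above.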

A semi net-form game is a special case of a general network-form game [32].
In particular, a semi net-form game allows each player to control only one decision
node, whereas the full network-form game makes no such restriction, allowing a
player to control multiple decision nodes in the net. Branching (via "branch nodes")
is another feature not available in semi net-form games. Like a net-form game,
Multi-Agent Influence Diagrams [20] also allow multiple nodes to be controlled
by each player. Unlike a net-form game, however, they do not consider boundedly
rational agents, and have special utility nodes rather than utility functions.



2.3 A Simple Semi Network-Form Game Example 

We illustrate the basic concepts of semi net-form games using the simple
example shown in Figure 1. In this example, there are 6 random variables
($A, B, C, D, P_1, P_2$) represented as nodes in the net; the edges between nodes define
the conditional dependence between random variables. For example, the probability
of $D$ depends on the values of $P_1$ and $P_2$, while the probability of $A$ does not depend
on any other variables. We distinguish between the two types of nodes: chance nodes
($A, B, C, D$), and decision nodes ($P_1, P_2$). As discussed previously, chance nodes differ
from decision nodes in that their conditional probability distributions are specified
a priori. Decision nodes do not have these distributions pre-specified; rather,
what is pre-specified are the utility functions ($U_1$ and $U_2$) of those players. Using
their utility functions, their strategies $P(P_1 \mid B)$ and $P(P_2 \mid C)$ are computed to complete
the Bayes net. This computation requires the consideration of the Bayes net
from each player's perspective.

Figure 2 illustrates the Bayes net from $P_1$'s perspective. In this view, there are
nodes that are observed ($B$), there are nodes that are controlled ($P_1$), and there are
nodes that do not fall into either of these categories ($A, C, P_2, D$), but appear in the
player's utility function. This arises from the fact that in general the player's utility
function can be a function of any variable in the net. As a result, in order to evaluate
the expected value of his utility function for a particular candidate action (sometimes
we will use the equivalent game theoretic term "move"), $P_1$ must perform inference
over these variables based on what he observes.$^{1}$ Finally, the player chooses the
action that gives the highest expected utility.

Fig. 1 A simple net-form game example: Fixed conditional probabilities are specified for chance
nodes ($A, B, C, D$), while utility functions are specified for decision nodes ($P_1, P_2$). Players try to
maximize their expected utility over the Bayes net.



[Figure 2: Bayes net of Fig. 1 from player 1's perspective. Chance nodes: $A, B, C, D$; decision nodes: $P_1, P_2$. Node roles: Observe, Control, Infer (estimate). $U_1 = F_1(P_1, P_2, A, B, C, D)$.]

Fig. 2 A simple net-form example game from player 1's perspective: Using information that he
observes, the player infers over unobserved variables in the Bayes net in order to set the value of
his decision node.

1 We discuss the computational complexity of a particular equilibrium concept later in
Section 2.7.1.



2.4 Level-K Thinking



Level-K thinking is a game theoretic equilibrium concept used to predict the outcome
of human-human interactions. A number of studies [2, 3, 4, 6, 8, 33] have
shown promising results predicting experimental data in games using this method.
The concept of level-K is defined recursively as follows. A level $K$ player plays
(picks his action) as though all other players are playing at level $K - 1$, who, in
turn, play as though all other players are playing at level $K - 2$, etc. The process
continues until level 0 is reached, where the player plays according to a prespecified
prior distribution. Notice that running this process for a player at $K \geq 2$ results in
ricocheting between players. For example, if player A is a level 2 player, he plays
as though player B is a level 1 player, who in turn plays as though player A is a
level 0 player. Note that player B in this example may not be a level 1 player in
reality; player A merely assumes him to be one during his reasoning process. Since
this ricocheting process between levels takes place entirely in the player's mind, no
wall clock time is counted (we do not consider the time it takes for a human to run
through his reasoning process). We do not claim that humans actually think in this
manner, but rather that this process serves as a good model for predicting the outcome
of interactions at the aggregate level. In most games, K is a fairly low number
for humans; experimental studies 0) have found K to be somewhere between 1 and 
2. 
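
The recursion described above can be sketched for a two-player matrix game. This is an illustrative example only: the function name `level_k_strategy`, the payoff matrix, and the level 0 distributions are assumptions chosen for the sketch, not values from the chapter, and the best response here is exact rather than sampled.

```python
import numpy as np


def level_k_strategy(k, player, payoffs, level0):
    """Mixed strategy of `player` (0 or 1) reasoning at level k.

    payoffs[p][a0, a1] is player p's utility when player 0 plays a0 and
    player 1 plays a1; level0[p] is player p's level 0 distribution.
    """
    if k == 0:
        return np.asarray(level0[player], dtype=float)
    # A level k player assumes the other player reasons at level k - 1.
    belief = level_k_strategy(k - 1, 1 - player, payoffs, level0)
    if player == 0:
        expected = payoffs[0] @ belief      # one entry per own move a0
    else:
        expected = belief @ payoffs[1]      # one entry per own move a1
    strategy = np.zeros(len(expected))
    strategy[np.argmax(expected)] = 1.0     # pure best response
    return strategy


# Moves: 0 = climb, 1 = descend. Both players get 1 if they choose differently.
avoid = np.array([[0.0, 1.0], [1.0, 0.0]])
payoffs = [avoid, avoid]
level0 = [[0.8, 0.2], [0.8, 0.2]]           # level 0 players mostly climb

for k in range(3):
    print(k, level_k_strategy(k, 0, payoffs, level0))
```

In this toy game a level 1 player descends (because a level 0 opponent mostly climbs), and a level 2 player climbs (because a level 1 opponent descends), which shows the ricocheting between levels.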

Although this work uses level-K exclusively, we are by no means wedded to this 
equilibrium concept. In fact, semi net-form games can be adapted to use other mod- 
els, such as Nash equilibrium, Quantal Response Equilibrium, Quantal Level-K, and 
Cognitive Hierarchy. Studies [4, 33] have found that performance of an equilibrium
concept varies a fair amount depending on the game. Thus it may be wise to use 
different equilibrium concepts for different problems. 



2.5 Satisficing 

Bounded rationality, as coined by Simon [28], stems from observing the limitations
of humans during the decision-making process. That is, humans are limited by the
information they have, cognitive limitations of their minds, and the finite amount
of time they have to make decisions. The notion of satisficing [5, 28, 29] states
that humans are unable to evaluate the probability of all outcomes with sufficient
precision, and thus often make decisions based on adequacy rather than by finding
the true optimum. Because decision-makers lack the ability and resources to arrive
at the optimal solution, they instead apply their reasoning only after having greatly
simplified the choices available.

Studies have shown evidence of satisficing in human decision-making. In recent
experiments [5], subjects were given a series of calculations (additions and subtractions),
and were told that they would be given a monetary prize equal to the answer
of the calculation that they choose. Although the calculations were not difficult in 
nature, they did take effort to perform. The study found that most subjects did not 
exhaustively perform all the calculations, but instead stopped when a "high enough" 
reward was found. 



2.6 Level-K Relaxed Strategies 

We use the notions of level-K thinking and satisficing to motivate a new game theoretic
equilibrium concept called "level-K relaxed strategies." For a player $i$ to perform
classic level-K reasoning [4] requires $i$ to calculate best responses.$^{2}$ In turn,
calculating best responses often involves calculating the Bayesian posterior probability
of what information is available to the other players, $-i$, conditioned on the
information available to $i$. That posterior is an integral, which typically cannot be
evaluated in closed form.

In light of this, to use level-K reasoning, players must approximate those Bayesian 
integrals. We hypothesize that real-world players do this using Monte Carlo sam- 
pling. Or more precisely, we hypothesize that their behavior is consistent with their 
approximating the integrals that way. 

More concretely, given a node $v$, to form their best response, the associated
player $i = R^{-1}(v)$ will want to calculate quantities of the form
$\arg\max_{x_v} [E(u_i \mid x_v, x_{pa(v)})]$, where $u_i$ is the player's utility, $x_v$ is the variable set by
the player (i.e., his move), and $x_{pa(v)}$ is the realization of his parents that he observes.
We hypothesize that he (behaves as though he) approximates this calculation in several
steps. First, $M$ candidate moves are chosen via IID sampling the player's satisficing distribution.
Now, for each candidate move, he must estimate the expected utility resulting from
playing that move. He does this by sampling the posterior probability distribution
$P^K(X_V \mid x_v, x_{pa(v)})$ (which accounts for what he knows), and computing the sample
expectation $\hat{u}_i^K$. Finally, he picks the move that has the highest estimated expected
utility. Formally, we give the following definition:



2 We use the term best response in the game theoretic sense, i.e. the player chooses the move with 
the highest expected utility. 
Definition 2. Consider a semi network-form game $(G, X, u, R, \pi)$ with level $K - 1$
relaxed strategies$^{3}$ $\Lambda^{K-1}(X_{v'} \mid X_{pa(v')})$ defined for all $v' \in D$ and $K \geq 1$. For all nodes
$v$ and sets of nodes $Z$ in such a semi net-form game, define

1. $U = V \setminus \{v, pa(v)\}$,

2. $P^K(X_v \mid X_{pa(v)}) = \pi(X_v \mid X_{pa(v)})$ if $v \in C$,

3. $P^K(X_v \mid X_{pa(v)}) = \Lambda^{K-1}(X_v \mid X_{pa(v)})$ if $v \in D$, and

4. $P^K(X_Z) = \prod_{v'' \in Z} P^K(X_{v''} \mid X_{pa(v'')})$.

Definition 3. Consider a semi network-form game $(G, X, u, R, \pi)$. For all $v \in D$,
specify an associated level 0 distribution $\Lambda^0(X_v \mid x_{pa(v)}) \in \Delta_{X_v \mid \times_{v' \in pa(v)} X_{v'}}$ and an
associated satisficing distribution $\lambda(X_v \mid x_{pa(v)}) \in \Delta_{X_v \mid \times_{v' \in pa(v)} X_{v'}}$. Also specify counting
numbers $M$ and $M'$.

For any $K \geq 1$, the level $K$ relaxed strategy of node $v \in D$ is the conditional
distribution $\Lambda^K(X_v \mid x_{pa(v)}) \in \Delta_{X_v \mid \times_{v' \in pa(v)} X_{v'}}$ sampled by running the following stochastic
process independently for each $x_{pa(v)} \in X_{pa(v)}$:

1. Form a set $\{x'_v(j) : j = 1, \ldots, M\}$ of IID samples of $\lambda(X_v \mid x_{pa(v)})$ and then remove
all duplicates. Let $m$ be the resultant size of the set;

2. For $j = 1, \ldots, m$, form a set $\{x'_V(k; x'_v(j)) : k = 1, \ldots, M'\}$ of IID samples of the
joint distribution

$$P^K(X_V \mid x'_v(j), x_{pa(v)}) = \prod_{v' \in V} P^K(X_{v'} \mid X_{pa(v')}) \, \delta_{X_{pa(v)}, x_{pa(v)}} \, \delta_{X_v, x'_v(j)};$$

and compute

$$\hat{u}_i^K(x'_U(; x'_v(j)), x'_v(j), x_{pa(v)}) = \frac{1}{M'} \sum_{k=1}^{M'} u_i(x'_V(k; x'_v(j)));$$

where $x'_V(; x'_v(j))$ is shorthand for $\{x'_{v'}(k; x'_v(j)) : v' \in V, k = 1, \ldots, M'\}$
(and $x'_U(; x'_v(j))$ for its restriction to the nodes in $U$);

3. Return $x'_v(j^*)$ where $j^* \equiv \arg\max_j [\hat{u}_i^K(x'_U(; x'_v(j)), x'_v(j), x_{pa(v)})]$.

Intuitively, the counting numbers $M$ and $M'$ can be interpreted as a measure of
a player's rationality. Take, for example, $M \to \infty$ and $M' \to \infty$. Then the player's
entire movespace would be considered as candidate moves, and the expected utility
of each candidate move would be perfectly evaluated. Under these circumstances,
the player will always choose the best possible move, making him perfectly rational.
On the other hand, if $M = 1$, this results in the player choosing his move according
to his satisficing distribution, corresponding to random behavior.
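
The three steps of Definition 3 can be sketched in Python for a single decision node and a fixed observation $x_{pa(v)}$. Everything named here is an assumption made for illustration: `sample_relaxed_move`, the callable `sample_rest` (assumed to draw the remaining nodes from $P^K(X_V \mid x_v, x_{pa(v)})$), and the toy utility.

```python
import random


def sample_relaxed_move(sample_satisficing, sample_rest, utility, M, M_prime, rng):
    """Draw one move from a level-K relaxed strategy for fixed x_pa(v)."""
    # Step 1: M IID candidate moves; dict.fromkeys drops duplicates, keeps order.
    candidates = list(dict.fromkeys(sample_satisficing(rng) for _ in range(M)))

    # Step 2: for each candidate, average utility over M' fresh samples.
    def estimate(move):
        total = sum(utility(move, sample_rest(move, rng)) for _ in range(M_prime))
        return total / M_prime

    # Step 3: return the candidate with the highest estimate.
    return max(candidates, key=estimate)


# Toy problem (hypothetical): moves 0..4, Nature adds noise, utility peaks at 3.
def satisficing(rng):
    return rng.randrange(5)


def nature(move, rng):
    return rng.gauss(0.0, 1.0)


def utility(move, noise):
    return -(move - 3) ** 2 + noise


rng = random.Random(0)
print(sample_relaxed_move(satisficing, nature, utility, M=1, M_prime=1, rng=rng))
print(sample_relaxed_move(satisficing, nature, utility, M=50, M_prime=200, rng=rng))
```

With `M=1` the returned move is simply the single satisficing sample, while larger `M` and `M_prime` make the choice of the utility-maximizing move increasingly likely, in line with the interpretation of the counting numbers given above.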

3 We will define level-K relaxed strategies in Definition 3.

One of the strengths of Monte Carlo expectation estimation is that it is unbiased
[25]. This property carries over to level-K relaxed strategies. More precisely, consider
a level $K$ relaxed player $i$, deciding which of his moves $\{x'_v(j) : j \in 1, \ldots, m\}$
to play for the node $v$ he controls, given a particular set of values $x_{pa(v)}$ that he
observes. To do this he will compare the values $\hat{u}_i^K(x'_U(; x'_v(j)), x'_v(j), x_{pa(v)})$. These
values are all unbiased estimates of the associated conditional expected utility$^{4}$ evaluated
under an equivalent Bayes net $\Gamma_i$ defined in Theorem 1. Formally, we have the
following:

Theorem 1. Consider a semi net-form game $(G, X, u, R, \pi)$ with associated satisficing
$\lambda(X_v \mid x_{pa(v)})$ and level 0 distribution $\Lambda^0(X_v \mid x_{pa(v)})$ specified for all players.

Choose a particular player $i$ of that game, a particular level $K$, and a player
move $x_v = x'_v(j)$ from Definition 3 for some particular $j$. Consider any values $x_{pa(v)}$
where $v$ is the node controlled by player $i$. Define $\Gamma_i$ as any Bayes net whose directed
acyclic graph is $G$, where for all nodes $v' \in C$, $P_{\Gamma_i}(X_{v'} \mid X_{pa(v')}) = \pi(X_{v'} \mid X_{pa(v')})$, for
all nodes $v'' \in D$, $P_{\Gamma_i}(X_{v''} \mid X_{pa(v'')}) = \Lambda^{K-1}(X_{v''} \mid X_{pa(v'')})$, and where $P_{\Gamma_i}(X_v \mid X_{pa(v)})$ is
arbitrary so long as $P_{\Gamma_i}(x_v \mid x_{pa(v)}) \neq 0$. We also define the notation $P_{\Gamma_i}(X_Z)$ for a set $Z$
of nodes to mean $\prod_{v' \in Z} P_{\Gamma_i}(X_{v'} \mid X_{pa(v')})$.

Then the expected value $E(\hat{u}_i^K \mid x'_v(j), x_{pa(v)})$ evaluated under the associated
level-K relaxed strategy equals $E(u_i \mid x_v, x_{pa(v)})$ evaluated under the Bayes net $\Gamma_i$.

Proof. Write

$$
\begin{aligned}
E(\hat{u}_i^K \mid x'_v(j), x_{pa(v)})
&= \int dx'_V(; x'_v(j)) \, P(x'_V(; x'_v(j)) \mid x'_v(j), x_{pa(v)}) \, \hat{u}_i^K(x'_V(; x'_v(j))) \\
&= \int dX_U \, P^K(X_U \mid x_v, x_{pa(v)}) \, u_i(X_U, x_v, x_{pa(v)}) \\
&= \frac{\int dX_U \, P^K(X_U, x_v, x_{pa(v)}) \, u_i(X_U, x_v, x_{pa(v)})}{\int dX_U \, P^K(X_U, x_v, x_{pa(v)})} \\
&= \frac{\int dX_U \prod_{v' \in U} P^K(X_{v'} \mid X_{pa(v')}) \prod_{v'' \in pa(v)} P^K(x_{v''} \mid X_{pa(v'')}) \, P^K(x_v \mid x_{pa(v)}) \, u_i(X_U, x_v, x_{pa(v)})}{\int dX_U \prod_{z' \in U} P^K(X_{z'} \mid X_{pa(z')}) \prod_{z'' \in pa(v)} P^K(x_{z''} \mid X_{pa(z'')}) \, P^K(x_v \mid x_{pa(v)})} \\
&= \frac{\int dX_U \prod_{v' \in U} P^K(X_{v'} \mid X_{pa(v')}) \prod_{v'' \in pa(v)} P^K(x_{v''} \mid X_{pa(v'')}) \, u_i(X_U, x_v, x_{pa(v)})}{\int dX_U \prod_{z' \in U} P^K(X_{z'} \mid X_{pa(z')}) \prod_{z'' \in pa(v)} P^K(x_{z''} \mid X_{pa(z'')})} \\
&= \frac{\int dX_U \prod_{v' \in U} P_{\Gamma_i}(X_{v'} \mid X_{pa(v')}) \prod_{v'' \in pa(v)} P_{\Gamma_i}(x_{v''} \mid X_{pa(v'')}) \, u_i(X_U, x_v, x_{pa(v)})}{\int dX_U \prod_{z' \in U} P_{\Gamma_i}(X_{z'} \mid X_{pa(z')}) \prod_{z'' \in pa(v)} P_{\Gamma_i}(x_{z''} \mid X_{pa(z'')})} \\
&= \int dX_U \, P_{\Gamma_i}(X_U \mid x_v, x_{pa(v)}) \, u_i(X_U, x_v, x_{pa(v)}) \\
&= E(u_i \mid x_v, x_{pa(v)}) \qquad \square
\end{aligned}
$$

4 Note that the true expected conditional utility is not defined without an associated complete Bayes
net. However, we show in the proof of Theorem 1 that the expected conditional utility is actually
independent of the probability $P_{\Gamma_i}(X_v \mid X_{pa(v)})$ and so it can be chosen arbitrarily. We make the assumption
that $P_{\Gamma_i}(x_v \mid x_{pa(v)}) \neq 0$ for mathematical formality to avoid dividing by zero in the proof.

In other words, we can set $P_{\Gamma_i}(x_v \mid x_{pa(v)})$ arbitrarily (as long as it is nonzero)
and still have the utility estimate evaluated under the associated level-K relaxed
strategy be an unbiased estimate of the expected utility conditioned on $x_v$ and $x_{pa(v)}$
evaluated under $\Gamma_i$. Unbiasedness in level-K relaxed strategies is important because the
player must rely on a limited number of samples to estimate the expected utility of
each candidate move. The difference of two unbiased estimates is itself unbiased,
enabling the player to compare estimates of expected utility without bias.



2.7 Level-K d-Relaxed Strategies 

A practical problem with relaxed strategies is that the number of samples may grow 
very quickly with depth of the Bayes net. The following example illustrates another 
problem: 

Example 1. Consider a net-form game with two simultaneously moving players, Bob
and Nature, both making $\mathbb{R}$-valued moves. Bob's utility function is given by the
difference between his and Nature's move.

So to determine his level 1 relaxed strategy, Bob chooses $M$ candidate moves by
sampling his satisficing distribution, and then Nature chooses $M'$ ("level 0") moves
for each of those $M$ moves by Bob. In truth, one of Bob's $M$ candidate moves, $x^*_{Bob}$,
is dominant$^{5}$ over the other $M - 1$ candidate moves due to the definition of the utility
function. However, since there is an independent set of $M'$ samples of Nature for
each of Bob's moves, there is nonzero probability that Bob won't return $x^*_{Bob}$, i.e.,
his level 1 relaxed strategy has nonzero probability of returning some other move.

As it turns out, a slight modification to the Monte Carlo process defining relaxed
strategies results in Bob returning $x^*_{Bob}$ with probability 1 in Example 1 for many
graphs $G$. This modification also reduces the explosion in the number of Monte
Carlo samples required for computing the players' strategies.

5 We use the term dominant in the game theoretic sense, i.e., the move gives Bob the highest
expected utility no matter what move Nature makes.

This modified version of relaxed strategies works by setting aside a set $Y$ of
nodes which are statistically independent of the state of $v$. Nodes in $Y$ do not have
to be resampled for each value $x_v$. Formally, the set $Y$ will be defined using the
dependence-separation (d-separation) property concerning the groups of nodes in
the graph $G$ that defines the semi net-form game [19, 20, 23]. Accordingly, we call
this modification "d-relaxed strategies." Indeed, by not doing any such resampling,
we can exploit the "common random numbers" technique to improve the Monte
Carlo estimates [25]. Loosely speaking, to choose the move with the highest estimate
of expected utility requires one to compare all pairs of estimates and thus
implicitly evaluate their differences. Recall that the variance of a difference of two
estimates is given by $Var(\chi - \upsilon) = Var(\chi) + Var(\upsilon) - 2Cov(\chi, \upsilon)$. By using
d-relaxed strategies, we expect the covariance $Cov(\chi, \upsilon)$ to be positive, reducing the
overall variance in the choice of the best move.
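
The effect of reusing samples can be checked numerically on a version of Example 1. This is a sketch under stated assumptions: Nature's level 0 distribution is taken to be standard normal, Bob has three hypothetical candidate moves, and the helper names `pick_move` and `error_rate` are invented for this illustration.

```python
import random


def pick_move(candidates, M_prime, rng, reuse_samples):
    """Bob's choice; utility is (Bob's move) - (Nature's move)."""
    # One shared batch of Nature's moves, used only when reuse_samples is True.
    shared = [rng.gauss(0.0, 1.0) for _ in range(M_prime)]

    def estimate(b):
        if reuse_samples:
            nature = shared                      # common random numbers
        else:
            nature = [rng.gauss(0.0, 1.0) for _ in range(M_prime)]
        return sum(b - n for n in nature) / M_prime

    return max(candidates, key=estimate)


def error_rate(reuse_samples, trials=2000, seed=0):
    """Fraction of trials in which Bob fails to return the dominant move."""
    rng = random.Random(seed)
    candidates = [0.0, 0.1, 0.2]                 # 0.2 is the dominant move
    wrong = sum(pick_move(candidates, 5, rng, reuse_samples) != 0.2
                for _ in range(trials))
    return wrong / trials


print("independent samples:", error_rate(False))
print("shared samples:     ", error_rate(True))
```

With shared samples every estimate is shifted by the same sample mean of Nature's moves, so the ranking of Bob's candidates is exact; with independent samples the noise in each estimate is large relative to the gaps between candidates and Bob frequently returns a dominated move.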

Definition 4. Consider a semi network-form game $(G, X, u, R, \pi)$ with level $K - 1$
d-relaxed strategies$^{6}$ $\Lambda^{K-1}(X_{v'} \mid X_{pa(v')})$ defined for all $v' \in D$ and $K \geq 1$. For all
nodes $v$ and sets of nodes $Z$ in such a semi net-form game, define

1. $S_v = succ(v)$,

2. $S_{-v} = V \setminus \{v \cup S_v\}$,

3. $Y = V \setminus \{v \cup pa(v) \cup S_v\}$,

4. $P^K(X_v \mid X_{pa(v)}) = \pi(X_v \mid X_{pa(v)})$ if $v \in C$,

5. $P^K(X_v \mid X_{pa(v)}) = \Lambda^{K-1}(X_v \mid X_{pa(v)})$ if $v \in D$, and

6. $P^K(X_Z) = \prod_{v'' \in Z} P^K(X_{v''} \mid X_{pa(v'')})$.

Note that $Y \cup pa(v) = S_{-v}$ and $v \cup S_v \cup S_{-v} = V$. The motivation for these definitions
comes from the fact that $Y$ is precisely the set of nodes that are d-separated
from $v$ by $pa(v)$. As a result, when the player who controls $v$ samples $X_v$ conditioned
on the observed $x_{pa(v)}$, the resultant value $x_v$ is statistically independent of the values
of all the nodes in $Y$. Therefore the same set of samples of the values of the nodes
in $Y$ can be reused for each new sample of $X_v$. This kind of reuse can provide substantial
computational savings in the reasoning process of the player who controls
$v$. We now consider the modified sampling process, noting that a level-K d-relaxed
strategy is defined recursively in $K$, via the sampling of $P^K$. Note that in general,
Definition 3 and Definition 5 do not lead to the same player strategies (conditional
distributions), as seen in Example 1.

Definition 5. Consider a semi network-form game $(G, X, u, R, \pi)$ with associated
level 0 distributions $\Lambda^0(X_v \mid x_{pa(v)})$ and satisficing distributions $\lambda(X_v \mid x_{pa(v)})$. Also
specify counting numbers $M$ and $M'$.

For any $K \geq 1$, the level $K$ d-relaxed strategy of node $v \in D$, where $v$ is controlled
by player $i$, is the conditional distribution $\Lambda^K(X_v \mid x_{pa(v)}) \in \Delta_{X_v \mid \times_{v' \in pa(v)} X_{v'}}$
that is sampled by running the following stochastic process independently for each
$x_{pa(v)} \in X_{pa(v)}$:

1. Form a set $\{x'_v(j) : j = 1, \ldots, M\}$ of IID samples of $\lambda(X_v \mid x_{pa(v)})$ and then remove
all duplicates. Let $m$ be the resultant size of the set;

2. Form a set $\{x'_{S_{-v}}(k) : k = 1, \ldots, M'\}$ of IID samples of the distribution over $X_{S_{-v}}$
given by

$$P^K(X_{S_{-v}} \mid x_{pa(v)}) = \prod_{v' \in S_{-v}} P^K(X_{v'} \mid X_{pa(v')}) \, \delta_{X_{pa(v)}, x_{pa(v)}};$$

3. For $j = 1, \ldots, m$, form a set $\{x'_{S_v}(k, x'_v(j)) : k = 1, \ldots, M'\}$ of IID samples of the
distribution over $X_{S_v}$ given by

$$P^K(X_{S_v} \mid x'_Y(;), x'_v(j), x_{pa(v)}) = \prod_{v' \in S_v} P^K(X_{v'} \mid X_{pa(v')}) \prod_{v'' \in S_{-v}} \delta_{X_{v''}, x'_{v''}(k)} \, \delta_{X_v, x'_v(j)};$$

and compute

$$\hat{u}_i^K(x'_Y(;), x'_{S_v}(; x'_v(j)), x'_v(j), x_{pa(v)}) = \frac{1}{M'} \sum_{k=1}^{M'} u_i(x'_Y(k), x'_{S_v}(k, x'_v(j)), x'_v(j), x_{pa(v)});$$

where $x'_Y(;)$ is shorthand for $\{x'_{v'}(k) : v' \in Y, k = 1, \ldots, M'\}$ and $x'_{S_v}(; x'_v(j))$ is shorthand
for $\{x'_{S_v}(k, x'_v(j)) : k = 1, \ldots, M'\}$.

4. Return $x'_v(j^*)$ where $j^* \equiv \arg\max_j [\hat{u}_i^K(x'_Y(;), x'_{S_v}(; x'_v(j)), x'_v(j), x_{pa(v)})]$.

6 We will define level-K d-relaxed strategies in Definition 5.

Definition 5 requires directly sampling from a conditional probability, which requires
rejection sampling. This is highly inefficient if the observed value of $pa(v)$ has low
probability, and actually impossible if $pa(v)$ is continuous. For these computational reasons,
we introduce a variation of the previous algorithm based on likelihood-weighted
sampling rather than rejection sampling. Although the procedure, as we shall see in
Definition 7, is only able to estimate the player's expected utility up to a proportionality
constant (due to the use of likelihood-weighted sampling [19]), we point out
that this is sufficient since proportionality is all that is required to choose between
candidate moves. Note that the unnormalized likelihood-weighted level-K d-relaxed
strategy, like the level-K d-relaxed strategy, is defined recursively in $K$.
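
The idea of likelihood weighting can be sketched on a two-node net in which an unobserved binary node $A$ has an observed noisy child $O$ that the player sees before moving. The prior `P_A1`, the sensor model `likelihood`, the utility `match`, and the function name `weighted_estimates` are all assumptions made for this illustration; they are not taken from the chapter.

```python
import random

P_A1 = 0.3                      # assumed prior P(A = 1)


def likelihood(o, a):
    """Assumed sensor model P(O = o | A = a): correct 90% of the time."""
    return 0.9 if o == a else 0.1


def weighted_estimates(o_obs, moves, utility, M_prime, rng):
    """Unnormalized expected-utility estimates for each candidate move."""
    # Sample the unobserved node A from its prior; no sample is rejected.
    samples = [1 if rng.random() < P_A1 else 0 for _ in range(M_prime)]
    # Weight each sample by the likelihood of the observation actually made.
    weights = [likelihood(o_obs, a) for a in samples]
    return {x: sum(w * utility(x, a) for w, a in zip(weights, samples)) / M_prime
            for x in moves}


def match(x, a):
    """Hypothetical utility: 1 if the move equals the hidden state."""
    return 1.0 if x == a else 0.0


est = weighted_estimates(1, [0, 1], match, 20000, random.Random(0))
best = max(est, key=est.get)
print(est, best)
```

The estimates converge to $P(A = x)\,P(O = 1 \mid A = x)$, i.e. the conditional expected utilities multiplied by the common factor $P(O = 1)$. Because that factor is the same for every candidate move, the ranking of moves is unaffected, which is why an unnormalized estimate suffices.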

Definition 6. Consider a semi network-form game $(G, X, u, R, \pi)$ with unnormalized
likelihood-weighted level $K - 1$ d-relaxed strategies$^{7}$ $\Lambda^{K-1}(X_{v'} \mid X_{pa(v')})$ defined for
all $v' \in D$ and $K \geq 1$. For all nodes $v$ and sets of nodes $Z$ in such a semi net-form
game, define

1. $P^K(X_v \mid X_{pa(v)}) = \pi(X_v \mid X_{pa(v)})$ if $v \in C$,

2. $P^K(X_v \mid X_{pa(v)}) = \Lambda^{K-1}(X_v \mid X_{pa(v)})$ if $v \in D$, and

3. $P^K(X_Z) = \prod_{v'' \in Z} P^K(X_{v''} \mid X_{pa(v'')})$.

Definition 7. Consider a semi network-form game $(G, X, u, R, \pi)$ with associated
level 0 distributions $\Lambda^0(X_v \mid x_{pa(v)})$ and satisficing distributions $\lambda(X_v \mid x_{pa(v)})$. Also
specify counting numbers $M$ and $M'$, and recall the meaning of set $Y$ from Definition 4.

For any $K \geq 1$, the unnormalized likelihood-weighted level $K$ d-relaxed strategy
of node $v$, where node $v$ is controlled by player $i$, is the conditional distribution
$\Lambda^K(X_v \mid x_{pa(v)}) \in \Delta_{X_v \mid \times_{v' \in pa(v)} X_{v'}}$ that is sampled by running the following stochastic
process independently for each $x_{pa(v)} \in X_{pa(v)}$:

1. Form a set $\{x'_v(j) : j = 1, \ldots, M\}$ of IID samples of $\lambda(X_v \mid x_{pa(v)})$, and then remove
all duplicates. Let $m$ be the resultant size of the set;

2. Form a set of weight-sample pairs $\{(w'(k), x'_{S_{-v}}(k)) : k = 1, \ldots, M'\}$ by setting
$X_{pa(v)} = x_{pa(v)}$, sampling the distribution over $X_Y$ given by

$$P^K(X_Y) = \prod_{v' \in Y} P^K(X_{v'} \mid X_{pa(v')})$$

and setting

$$w'(k) = \prod_{v' \in pa(v)} P^K(x_{v'} \mid x'_{pa(v')}(k));$$

3. For $j = 1, \ldots, m$, form a set $\{x'_{S_v}(k, x'_v(j)) : k = 1, \ldots, M'\}$ of IID samples of the
distribution over $X_{S_v}$ given by

$$P^K(X_{S_v} \mid x'_Y(;), x'_v(j), x_{pa(v)}) = \prod_{v' \in S_v} P^K(X_{v'} \mid X_{pa(v')}) \prod_{v'' \in S_{-v}} \delta_{X_{v''}, x'_{v''}(k)} \, \delta_{X_v, x'_v(j)};$$

and compute

$$\hat{u}_i^K(x'_Y(;), x'_{S_v}(; x'_v(j)), x'_v(j), x_{pa(v)}) = \frac{1}{M'} \sum_{k=1}^{M'} w'(k) \, u_i(x'_Y(k), x'_{S_v}(k, x'_v(j)), x'_v(j), x_{pa(v)});$$

4. Return $x'_v(j^*)$ where $j^* \equiv \arg\max_j [\hat{u}_i^K(x'_Y(;), x'_{S_v}(; x'_v(j)), x'_v(j), x_{pa(v)})]$.

7 We will define unnormalized likelihood-weighted level-K d-relaxed strategies in Definition 7.



2.7.1 Computational Complexity 

Let $N$ be the number of players. Intuitively, as each level $K$ player samples the Bayes
net from their perspective, they initiate samples by all other players at level $K-1$.
These players, in turn, initiate samples by all other players at level $K-2$, continuing
until level 1 is reached (since level 0 players do not sample the Bayes net).

As an example, Figure 3 enumerates the number of Bayes net samples required
to perform level-K d-relaxed sampling for $N = 3$ where all players reason at $K = 3$.
Each square represents performing the Bayes net sampling process once. As shown
in the figure, the sampling process of $P_A$ at level 3 initiates sampling processes in the
two other players, $P_B$ and $P_C$, at level 2. This cascading effect continues until level
1 is reached, and is repeated from the top for $P_B$ and $P_C$ at level 3. In general, when
all players play at the same level $K$, this may be conceptualized as having $N$ trees
of degree $N-1$ and depth $K$, and therefore a computational complexity proportional
to $\sum_{j=0}^{K-1} (N-1)^j N$, or $O(N^K)$. In other words, the computational complexity
is polynomial in the number of players and exponential in the number of levels.
Fortunately, experiments [4][6] have found $K$ to be small in human reasoning.
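
The counting argument can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and it assumes (as described above) that a level-k process triggers one level-(k-1) process in each of the other N-1 players and that level 0 players perform no sampling.

```python
def count_sampling_processes(num_players: int, level: int) -> int:
    """Count Bayes net sampling processes when every player reasons at `level`."""

    def tree_size(k: int) -> int:
        # Level 0 players do not sample the Bayes net at all.
        if k <= 0:
            return 0
        # A level-1 process triggers no further sampling processes.
        if k == 1:
            return 1
        # One process at level k, plus one subtree per other player at level k-1.
        return 1 + (num_players - 1) * tree_size(k - 1)

    # One tree per player at the top level.
    return num_players * tree_size(level)


def closed_form(num_players: int, level: int) -> int:
    """Closed form: N * sum_{j=0}^{K-1} (N-1)^j."""
    return num_players * sum((num_players - 1) ** j for j in range(level))


total = count_sampling_processes(3, 3)  # the N = 3, K = 3 case discussed above
```

Under this counting, the recursive count and the closed-form sum agree, and the count grows geometrically in the level for a fixed number of players.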
[Figure 3: tree diagram of sampling processes, with rows labeled Level 3, Level 2, and Level 1.]

Fig. 3 Computational complexity of level-K d-relaxed strategies with $N = 3$ and $K = 3$: Each
box represents a single execution of the algorithm. The computational complexity is found to be
$O(N^K)$.



3 Using Semi Net-Form Games to Model Mid-Air Encounters 

TCAS is an aircraft collision avoidance system currently mandated by the International
Civil Aviation Organization to be fitted to all aircraft with a maximum take-off
mass of over 5700 kg (12,566 lb) or authorized to carry more than 19 passengers.
It is an onboard system designed to operate independently of ground-based air traffic
management systems to serve as the last layer of safety in the prevention of
mid-air collisions. TCAS continuously monitors the airspace around an aircraft and
warns pilots of nearby traffic. If a potential threat is detected, the system will issue
a Resolution Advisory (RA), i.e., a recommended escape maneuver, to the pilot. The
RA is presented to the pilot in both visual and audible form. Depending on the
aircraft, visual cues are typically implemented on either an instantaneous vertical
speed indicator, a vertical speed tape that is part of a primary flight display, or using
pitch cues displayed on the primary flight display. Audibly, commands such as
"Climb, Climb!" or "Descend, Descend!" are heard.

If both (own and intruder) aircraft are TCAS-equipped, the issued RAs are coordinated,
i.e., the system will recommend different directions to the two aircraft.
This is accomplished via the exchange of "intents" (coordination messages). However,
not all aircraft in the airspace are TCAS-equipped (e.g., general aviation aircraft). Those
that are not equipped cannot issue RAs.

While TCAS has performed satisfactorily in the past, there are many limitations
to the current TCAS system. First, since TCAS input data is very noisy in the horizontal
direction, issued RAs are in the vertical direction only, greatly limiting the
solution space. Secondly, TCAS is composed of many complex deterministic rules,
rendering it difficult for the authorities responsible for the maintenance of the system
(e.g., the Federal Aviation Administration) to understand, maintain, and upgrade.
Thirdly, TCAS assumes straight-line aircraft trajectories and does not take into account
flight plan information. This leads to a high false-positive rate, especially in
the context of closely-spaced parallel approaches.

This work focuses on addressing one major weakness of TCAS: the design assumption
of a deterministic pilot model. Specifically, TCAS assumes that a pilot
receiving an RA will delay for 5 seconds, and then accelerate at 1/4 g to execute
the RA maneuver precisely. Although pilots are trained to obey in this manner, a
recent study of the Boston area [21] has found that only 13% of RAs are obeyed, in the
sense that the aircraft response maneuver met the TCAS design assumptions in vertical speed
and promptness. In 64% of the cases, pilots were in partial compliance: the aircraft
moved in the correct direction, but did not move as promptly or as aggressively as
instructed. Shockingly, the study found that in 23% of RAs, the pilots actually responded
by maneuvering the aircraft in the opposite direction of that recommended
by TCAS (a number of these cases of non-compliance may be attributed to visual
flight rules). As air traffic density is expected to double in the next 30 years [13],
the safety risks of using such a system will increase dramatically.

Pilot interviews have offered many insights toward understanding these statis- 
tics. The main problem is a mismatch between the pilot model used to design the 
TCAS system and the behavior exhibited by real human pilots. During a mid-air encounter,
the pilot does not blindly execute the RA maneuver. Instead, he combines
the RA with other sources of information (e.g., instrument panel, visual observation)
to judge his best course of action. In doing this, he quantifies the quality of a course
of action in terms of a utility function, or degree of happiness, defined over possible 
results of that course of action. That utility function does not only involve proximity 
to the other aircraft in the encounter, but also involves how drastic a maneuver the 
pilot makes. For example, if the pilot believes that a collision is unlikely based on 
his observations, he may opt to ignore the alarm and continue on his current course, 
thereby avoiding any loss of utility incurred by maneuvering. This is why a pilot 
will rationally decide to ignore alarms with a high probability of being false. 

When designing TCAS, a high false alarm rate need not be bad in and of it- 
self. Rather what is bad is a high false alarm rate combined with a pilot's utility 
function to result in pilot behavior which is undesirable at the system level. This 
more nuanced perspective allows far more powerful and flexible design of alarm 
systems than simply worrying about the false positive rate. Here, this perspective 
is elaborated. We use a semi net-form game for predicting the behavior of a sys- 
tem comprising automation and humans who are motivated by utility functions and 
anticipation of one another's behavior. 

Recall the definition of a semi net-form game via a quintuple $(G, X, u, R, \pi)$ in
Definition 1. We begin by specifying each component of this quintuple. To increase
readability, sometimes we will use (and mix) the equivalent notation $Z = X_Z$, $z = x_Z$,
and $z' = x'_Z$ for a node $Z$ throughout the TCAS modeling.



Visual flight rules are a set of regulations set forth by the Federal Aviation Administration which 
allow a pilot to operate an aircraft relying on visual observations (rather than cockpit instruments). 
3.1 Directed Acyclic Graph G

The directed acyclic graph $G$ for a 2-aircraft encounter is shown in Figure 4. At any
time $t$, the true system state of the mid-air encounter is represented by the world
state $S$, which includes the states of all aircraft. Since the pilots (the players in this
model) and TCAS hardware are not able to observe the world state perfectly, a layer
of nodes is introduced to model observational noise and incomplete information.
The variable $W_i$ represents pilot $i$'s observation of the world state, while $W_{TCAS_i}$
represents the observations of TCAS $i$'s sensors. A simplified model of the current
TCAS logic is then applied to $W_{TCAS_i}$ to emulate an RA $T_i$. Each pilot uses his own
observations $W_i$ and $T_i$ to choose an aircraft maneuver command $A_i$. Finally, we
produce the outcome $H$ by simulating the aircraft states forward in time using a
model of aircraft kinematics, and calculate the social welfare $F$. We will describe
the details of these variables in the following sections.




Fig. 4 Bayes net diagram of a 2-aircraft mid-air encounter: Each pilot chooses a vertical rate to 
maximize his expected utility based on his TCAS alert and a noisy partial observation of the world 
state. 
3.2 Variable Spaces X
3.2.1 Space of World State S

The world state $S$ contains all the states used to define the mid-air encounter environment.
It includes 10 states per aircraft to represent kinematics and pilot commands
(see Table 1) and 2 states per aircraft to indicate TCAS intent. Recall that TCAS has
coordination functionality, where it broadcasts its intents to other aircraft to avoid
issuing conflicting RAs. The TCAS intent variables are used to remember whether
an aircraft has previously issued an RA, and if so, what the sense (direction) was.

Table 1 A description of aircraft kinematic states and pilot inputs.

Variable          Units   Description
$x$               ft      Aircraft position in x direction
$y$               ft      Aircraft position in y direction
$z$               ft      Aircraft position in z direction
$\theta$          rad     Heading angle
$\dot{\theta}$    rad/s   Heading angle rate
$\dot{z}$         ft/s    Aircraft vertical speed
$f$               ft/s    Aircraft forward speed
$\phi_c$          rad     Commanded aircraft roll angle
$\dot{z}_c$       ft/s    Commanded aircraft vertical speed
$f_c$             ft/s    Commanded aircraft forward speed



3.2.2 Space of TCAS Observation $W_{TCAS_i}$

Being a physical system, TCAS does not have full and direct access to the world
state. Rather, it must rely on noisy partial observations of the world to make its
decisions. $W_{TCAS_i}$ captures these observational imperfections, modeling TCAS sensor
noise and limitations. Note that each aircraft has its own TCAS hardware and
makes its own observations of the world. Consequently, observations are made from
a particular aircraft's perspective. To be precise, we use $W_{TCAS_i}$ to denote the
observations that TCAS $i$ makes of the world state, where TCAS $i$ is the TCAS
system on board aircraft $i$. Table 2 describes each variable in $W_{TCAS_i}$. Variables are
real-valued (or positively real-valued where the negative values do not have physical
meaning).

3.2.3 Space of TCAS RA $T_i$

A simplified version of TCAS, called mini TCAS, is implemented based on [16]
with minor modifications (we will discuss the differences in Section 3.5.3). Mini
TCAS issues an RA $T_i$ based on $W_{TCAS_i}$ input, emulating the TCAS logic. The
variable $T_i$ represents the recommended target vertical rate issued to pilot $i$. We
model $T_i$ as able to take on one of 6 possible values: no RA issued, descend
at 42 ft/s, descend at 25 ft/s, level-off, climb at 25 ft/s, or climb at 42 ft/s, i.e.,
$T_i \in \{\emptyset, -42, -25, 0, 25, 42\}$, where $T_i = \emptyset$ indicates no RA issued; otherwise $T_i$ is
specified in ft/s.

Table 2 A description of $W_{TCAS_i}$ variables.

Variable       Unit   Description
$r_h$          ft     Horizontal range between own and intruding aircraft
$\dot{r}_h$    ft/s   Horizontal range rate
$\dot{h}$      ft/s   Relative vertical rate between own and intruding aircraft
$h$            ft     Relative altitude between own and intruding aircraft
$h_i$          ft     Own aircraft's altitude

3.2.4 Space of Pilot's Observation $W_i$

Aside from the RA, pilots have other sources of information about the world, such 
as those coming from instruments and visual observations. All paths of informa- 
tion are considered when the pilot decides his best course of action. Unfortunately, 
instruments and visuals also provide noisy partial observations of the world state. 

Properly speaking there are many intricacies that should be considered in the pi- 
lot observation model. First, the model should reflect the type and amount of infor- 
mation available via the cockpit instruments. Secondly, the model should reasonably 
approximate the visual observation characteristics and its limitations, such as field 
of view and geometry. For example, visual accuracy should decrease as distance 
increases, and moreover, visual observations of the intruding aircraft cannot be ac- 
quired altogether if the intruding aircraft is situated behind own aircraft. Lastly, the 
model should consider a pilot attention model, since pilots may miss detecting an 
intruding aircraft if they are preoccupied with other tasks. Attention and situational
awareness are large topics of research in psychology and human factors, especially
in the context of pilots and military personnel [11][12][30][31].

As a first step, we do not consider the above subtleties, and begin with a very
crude model for the pilot's observations. In particular, we model the pilot's observation
$W_i$ as being a corrupted version of $S$.

3.2.5 Space of Pilot's Move $A_i$ and Outcome $H$

At his decision point, pilot $i$ observes his parent nodes and takes an action (i.e., sets
the value of node $A_i$). The variable $A_i$ is the target vertical rate for aircraft $i$,
represented by a real-valued number between -50 and 50 ft/s. We take the outcome $H$ as
being in the same space as $S$. We will later see how this facilitates the simulation of
the encounter.



3.3 Utility Function u 

The pilot's utility function summarizes in a real number the degree of happiness for
a given joint outcome. It is a simple parameterization of the player that characterizes
him and summarizes his preferences. Each player acts to maximize his expected utility.
In this modeling, we assume all pilots have the same utility function.

Properly speaking, the utility function should be learned from data to be as re- 
alistic as possible. However, since the task of learning parameters from data is a 
significant research topic of its own, it is being pursued as a separate effort. For 
now, the utility functions are crafted using intuition gained from pilot interviews. 
The authors found that pilots considered primarily 3 priorities when deciding how 
to react to a TCAS RA. 

1. Avoid collision. Pilots will do all that is necessary to avoid a collision, thus this
has the highest priority. Since the fear of a collision increases as the aircraft get
closer, a representative metric for measuring this impetus for collision avoidance
is the minimum approach distance between aircraft $d_{min}$ (i.e., the smallest separation
distance between two aircraft over the entire encounter).

2. Course deviation. There are many reasons why a pilot does not want to deviate 
from his current course. For example, deviations are often associated with longer 
flight times, higher fuel consumption, and increased flying effort. The notion is 
that if a collision is deemed unlikely (i.e., there's a high chance of TCAS being 
a false positive), then the pilot will be inclined to stay on his current course. We 
reflect this inclination by penalizing the difference between the current vertical 
speed and the vertical speed in consideration. 

3. Obeying TCAS. Pilots have indicated that when they feel uncertain, they
are inclined to follow TCAS. In other words, all else being equal, pilots have
a natural tendency to follow RAs. This may be attributed to their training, their
inclination to follow orders, or even blind trust in the system. We summarize this
tendency into a metric by penalizing moves that deviate from the issued RA.

In summary, the utility function is chosen to be of the following form:

$u_i = \alpha_1 \log(\delta + d_{min}) - \alpha_2 |\dot{z} - a_i| - \alpha_3 |T_i - a_i|$

where $\alpha_1$, $\alpha_2$, and $\alpha_3$ are real positive constant weights, $\delta$ is a small positive
constant, $a_i$ is the pilot's action, $d_{min}$ is the minimum approach distance between the
aircraft, and $\dot{z}$ is the aircraft's current vertical speed. The weights reflect how the pilot
trades off the three competing objectives. The weight $\alpha_1$ is largest, since avoiding
collision is highest priority; $\alpha_2$ is the second largest, followed by $\alpha_3$ with the smallest
weight. The log function in the first term is used to capture the fact that the rate
of utility increase/decrease is much faster when the aircraft are close together than
when they are far apart.
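
As an illustration, the utility can be evaluated directly from its three terms. This is a minimal sketch: the function name is hypothetical, and the default weights and the value of delta are placeholders chosen only to respect the stated ordering (first weight largest, third smallest), since the source does not give their values. The case in which no RA is issued is not handled here.

```python
import math


def pilot_utility(d_min: float, vertical_speed: float, action: float, ra: float,
                  alphas=(10.0, 1.0, 0.5), delta: float = 1e-3) -> float:
    """Utility of one pilot for a candidate action.

    d_min          minimum approach distance over the encounter (ft)
    vertical_speed current vertical speed of the aircraft (ft/s)
    action         candidate target vertical rate a_i (ft/s)
    ra             vertical rate recommended by the RA, T_i (ft/s)
    alphas, delta  hypothetical placeholder constants, not from the source
    """
    alpha_1, alpha_2, alpha_3 = alphas
    collision_term = alpha_1 * math.log(delta + d_min)      # reward separation
    deviation_term = alpha_2 * abs(vertical_speed - action)  # penalize course change
    obedience_term = alpha_3 * abs(ra - action)               # penalize ignoring the RA
    return collision_term - deviation_term - obedience_term


# Same action, but the second outcome keeps the aircraft farther apart.
u_close = pilot_utility(d_min=200.0, vertical_speed=0.0, action=25.0, ra=25.0)
u_far = pilot_utility(d_min=2000.0, vertical_speed=0.0, action=25.0, ra=25.0)
```

Because of the logarithm, a fixed gain in separation raises the utility much more when the aircraft are close than when they are already far apart.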



3.4 Partition R 



We partition the variables in the net into chance and decision nodes as follows:
The nodes in the set $\{S, W_{TCAS_1}, W_{TCAS_2}, T_1, T_2, W_1, W_2, H\}$ are chance nodes, and
the nodes $A_1$ and $A_2$ are decision nodes. Moreover, player 1 sets the value of the
node $A_1$, and player 2 sets the value of the node $A_2$.



3.5 Set of Conditional Probabilities $\pi$

In this section, we describe the conditional probability distribution associated with
each chance node. Note the use of stochastic terminology such as "sample" and
"conditional probability distribution" for both stochastic and deterministic nodes.
This is because we may view a deterministic node as a stochastic node with all its
probability mass on its deterministic result.



3.5.1 CPD of the World State S 



At the beginning of an encounter, the world state is initialized using the encounter
generator (to be discussed in Section 3.7.3). Otherwise, the outcome $H$ at time $t - \Delta t$
becomes the world state $S$ at time $t$.



3.5.2 CPD of TCAS Observation $W_{TCAS_i}$

To calculate $W_{TCAS_i}$, the exact versions of the variables in Table 2 are first calculated
from the world state $S$ using the following equations:

$r_h = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2}$

$\dot{r}_h = \frac{1}{r_h} \left( (x_j - x_i)(f_j \cos\theta_j - f_i \cos\theta_i) + (y_j - y_i)(f_j \sin\theta_j - f_i \sin\theta_i) \right)$

$h = z_j - z_i$

where the subscripts $i$ and $j$ indicate own and intruding aircraft, respectively. We
then add zero-mean Gaussian noise (see footnote 9) to the variables to emulate sensor noise.
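
The computation above can be sketched as follows. This is an illustrative sketch only: the dictionary layout and function names are hypothetical, only the three quantities whose equations appear above are computed, and the noise standard deviations (100, 50, and 10) are taken from footnote 9 under the assumption that its values follow the variable order of Table 2.

```python
import math
import random


def exact_tcas_variables(own: dict, intruder: dict) -> dict:
    """Noise-free horizontal range, range rate, and relative altitude."""
    dx = intruder["x"] - own["x"]
    dy = intruder["y"] - own["y"]
    r_h = math.hypot(dx, dy)  # assumed nonzero (aircraft not co-located)
    # Relative horizontal velocity components (intruder minus own).
    dvx = intruder["f"] * math.cos(intruder["theta"]) - own["f"] * math.cos(own["theta"])
    dvy = intruder["f"] * math.sin(intruder["theta"]) - own["f"] * math.sin(own["theta"])
    r_h_dot = (dx * dvx + dy * dvy) / r_h
    h = intruder["z"] - own["z"]
    return {"r_h": r_h, "r_h_dot": r_h_dot, "h": h}


# Assumed mapping of the footnote 9 standard deviations onto these variables.
NOISE_STD = {"r_h": 100.0, "r_h_dot": 50.0, "h": 10.0}


def noisy_tcas_observation(own: dict, intruder: dict, rng: random.Random) -> dict:
    """Add independent zero-mean Gaussian noise to each exact variable."""
    exact = exact_tcas_variables(own, intruder)
    return {name: value + rng.gauss(0.0, NOISE_STD[name]) for name, value in exact.items()}


# Head-on geometry: own flies along +x, intruder flies along -x, 500 ft above.
own = {"x": 0.0, "y": 0.0, "z": 10000.0, "theta": 0.0, "f": 200.0}
intruder = {"x": 10000.0, "y": 0.0, "z": 10500.0, "theta": math.pi, "f": 200.0}
exact = exact_tcas_variables(own, intruder)
observed = noisy_tcas_observation(own, intruder, random.Random(0))
```

In this head-on example the exact range rate is negative, reflecting that the two aircraft are closing on each other.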



3.5.3 CPD of TCAS RA $T_i$



Recall from Section 3.2.3 that we use mini TCAS to emulate the full TCAS logic. 



The major assumptions of mini TCAS are: 

1. All aircraft are TCAS-equipped and coordinate RAs.

2. Actual horizontal range rate is used (see footnote 10).

3. No tracking or encounter monitoring over time is performed. Hence, mini TCAS
is a memory-less system.

4. No TCAS strengthenings or reversals (updates or revisions to the initial RA).

5. The tau-rising test and horizontal miss distance test are not performed [16].

The implementation of mini TCAS in this work follows closely that described in
[16]. First, the range, altitude, and altitude separation tests are performed for collision
detection. If no potential collision is detected, no RA is issued. If a potential
collision is detected, the algorithm then continues to determine the sense (direction)
and strength of the RA. In the sense selection process, the algorithm determines
which direction (ascend, descend, or level-off) gives the greatest vertical miss distance.
However, to account for TCAS coordination, a modification to the algorithm
is made. To avoid issuing conflicting RAs, intruders' senses (up, level, or down) that
appear in received intent messages are first removed from the list of candidate senses
for own aircraft. The direction is chosen from the remaining choices. Strength selection
follows to choose the least disruptive RA that still achieves the minimum
safety distance.
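
The coordinated sense selection described above can be sketched in a few lines. This is a hypothetical illustration rather than the mini TCAS implementation: the function name, the string labels for senses, and the predicted miss distances are all assumptions, and the detection tests and strength selection are omitted.

```python
from typing import Dict, Optional, Set


def select_sense(miss_distance_by_sense: Dict[str, float],
                 intruder_senses: Set[str]) -> Optional[str]:
    """Pick the sense with the greatest predicted vertical miss distance,
    after removing senses already claimed in received intent messages."""
    candidates = {
        sense: distance
        for sense, distance in miss_distance_by_sense.items()
        if sense not in intruder_senses
    }
    if not candidates:
        return None  # no sense left to choose from
    return max(candidates, key=candidates.get)


# Hypothetical predicted vertical miss distances (ft) for own aircraft.
predicted = {"up": 600.0, "level": 300.0, "down": 800.0}

uncoordinated = select_sense(predicted, intruder_senses=set())
coordinated = select_sense(predicted, intruder_senses={"down"})
```

In this example the uncoordinated choice would be "down", but once the intruder has announced a "down" intent, that sense is removed and "up" is selected instead.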

It is known that pilots react differently to a revised (second) RA than to the initial
one. Especially in cases where the RAs contradict one another, the pilots may experience
cognitive dissonance, and even go into a confused mental state. As a result,
modeling this phenomenon properly would require a whole new level of pilot modeling,
with perhaps separate models for the first and second RAs. One possible hack
is to use the same model for both RAs. However, doing so would yield misleading
results, since the pilot would experience no "mental conflict" in going against the previous
RA, and thus would be much more likely to comply with any RA change. Alternatively,
social welfare $F$ could be hacked to penalize reversals or strengthenings of RAs. For
now though, we discard any encounters that issue reversals or strengthenings.



9 We assume independent noise for each variable $r_h$, $\dot{r}_h$, $\dot{h}$, $h$, $h_i$ with standard deviations of
100, 50, 4, 10, 10, respectively. These variables are described in Table 2.

10 In [16], the horizontal range rate is fixed to -500 ft/s.
3.5.4 CPD of Pilot's Observation $W_i$ and Outcome $H$



We model the pilot's observation $W_i$ as being $S$ corrupted with additive zero-mean
Gaussian noise (see footnote 11). The outcome $H$ is calculated using the world state $S$, the pilot
actions $A_1$, $A_2$, and the aircraft kinematics described in Section 3.7.4.



3.6 Computing Level-K d-Relaxed Strategies

Using the semi net-form game $(G, X, u, R, \pi)$ specified previously, this section describes
the application of a modified version of Definition 7 to calculate player $i$'s
strategies. Table 3 specifies the additional parameters needed by the algorithm. To
be realistic, these model parameters should be learned from real data. However,
for now, they are chosen by hand. For convenience, we define the new variable
$WT_{i'}$, where $i'$ is a dummy player index, to be the combination of the nodes $W_{i'}$,
$W_{TCAS_{i'}}$, and $T_{i'}$. Let $v$ be the node controlled by player $i$. Then, applying Definition 7,
we see that $v = A_i$, $pa(v) = \{W_i, T_i\}$, $S_v = \{H\}$, $S_{-v} = \{S, WT_i, WT_{-i}, A_{-i}\}$,
and $Y = \{S, W_{TCAS_i}, WT_{-i}, A_{-i}\}$.



Table 3 Specification of parameters in unnormalized likelihood-weighted level-K d-relaxed strategies
(Definition 7) for the collision avoidance problem.

Parameter                        Value                                    Description
$K$                              2                                        Player level for all pilots
$M$                              5                                        Number of samples of pilot's own movespace
$M'$                             10                                       Number of samples of the pilot's environment
$\lambda(A_i \mid w_i, t_i)$     Uniform over movespace                   Satisficing distribution of player i
$\Lambda^0(A_i \mid w_i, t_i)$   Wide Gaussian ($\sigma = 20$) about RA   Level 0 distribution of player i



We proceed following the steps of Definition 7. In step 1, we form a set
$\{a'_i(j) : j = 1, \ldots, M\}$ by IID sampling $\lambda(A_i \mid w_i, t_i)$ $M$ times. Since the space of
$A_i$ is continuous, we do not need to worry about removing duplicates.

The application of step 2 requires a slight modification. Recall that TCAS logic
is deterministic, causing its probability $P^K(t_i \mid w'_{TCAS_i})$, where $t_i$ is the observed realization
of $T_i$, to be either 1 or 0. This creates a natural filtering effect that zeroes out
entire posterior probabilities in the sum according to whether the scenarios cause
the observed (evidence) RA to occur. In fact, since the space of the world state $S$ is
so large, we found the number of unusable samples to be impractically high. This
rendered the straightforward application of Definition 7 infeasible.



11 We assume independent noise for each variable $x$, $y$, $z$, $\theta$, $\dot{\theta}$, $\dot{z}$, $f$ with standard deviations of
100, 100, 20, 0.05, 0, 5, 10, respectively. These variables are described in Table 1.
To help direct the samples toward the relevant subspace, we introduce importance
sampling to propose nodes $S$ and $W_{TCAS_i}$ using their values sampled at the
top level, $s$ and $w_{TCAS_i}$ respectively. Note that the player does not have access to
$w_{TCAS_i}$ or $s$; rather, the simulator does. We use these variables to form the proposal
distribution for approximating the expectation. More concretely, we form a set of
weight-sample pairs $\{(w'(k), x'_{S_{-v}}(k)) : k = 1, \ldots, M'\}$ by setting $w'_i = w_i$ and $t'_i = t_i$,
and instead of sampling from $P^K(X_Y) = \prod_{v' \in Y} P^K(X_{v'} \mid x'_{pa(v')})$ as described in step
2 of Definition 7, we IID sample from:

$Q(X_Y) = \left[ \prod_{v' \in Y \setminus \{S, W_{TCAS_i}\}} P^K(X_{v'} \mid x'_{pa(v')}) \right] Q(S \mid s) \, Q(W_{TCAS_i} \mid w_{TCAS_i})$

and adjust the weighting factor accordingly by multiplying $w'(k)$ by:

$\frac{P^K(s'(k)) \, P^K(w'_{TCAS_i}(k) \mid s'(k))}{Q(s'(k) \mid s) \, Q(w'_{TCAS_i}(k) \mid w_{TCAS_i})}$

One can verify that this manipulation does not change the expected value [25]. Recall
that $S$ is composed of two parts: one that contains the kinematic states of the
aircraft, and the other that represents the TCAS intent messages. We denote these
variables as $S_K$ and $S_I$, respectively, and choose to propose them separately, i.e.,
$Q(S \mid s) = Q(S_K \mid s_K) \, Q(S_I \mid s_I)$. We choose $Q(S_K \mid s_K)$ to be a tight Gaussian distribution
(see footnote 12) centered about $s_K$, and choose $Q(S_I \mid s_I)$ to be a delta function about the
true value $s_I$ with probability $q$, or one of the following 4 values each with probability
$\frac{1}{4}(1 - q)$:

1. No intent received.

2. Intent received with an up sense.

3. Intent received with a level-off sense.

4. Intent received with a down sense.

We choose $Q(W_{TCAS_i} \mid w_{TCAS_i})$ to be a tight Gaussian distribution (see footnote 13) centered about
$w_{TCAS_i}$.

The trick, as always with importance sampling Monte Carlo, is to choose a proposal
distribution that will result in low variance, and that is nonzero at all values
where the integrand is nonzero [25]. In this case, so long as the proposal distribution
over $s'$ has full support, the second condition is met. So the remaining issue is
how much variance there will be. Since $Q(W_{TCAS_i} \mid w_{TCAS_i})$ is a tight Gaussian by
the choice of proposal distribution, values of $w'_{TCAS_i}$ will be very close to values of
$w_{TCAS_i}$, causing $P(t_i \mid w'_{TCAS_i})$ to be much more likely to equal 1 than 0. To reduce
the variance even further, rather than form $M'$ samples of the distribution, samples
of the proposal distribution are generated until $M'$ of them have nonzero posterior.
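
The reweighting idea can be illustrated on a one-dimensional toy problem. This is a generic sketch of importance sampling, not the TCAS model: the target P = N(0, 1), the proposal Q = N(0.5, 1.5), and the function names are assumptions chosen so that the proposal has full support and is wider than the target, which keeps the weights bounded.

```python
import math
import random


def normal_pdf(x: float, mean: float, std: float) -> float:
    """Density of a univariate Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_estimate(g, num_samples: int, seed: int = 0) -> float:
    """Estimate E_P[g(X)] for P = N(0, 1) using samples drawn from Q = N(0.5, 1.5)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(0.5, 1.5)  # sample from the proposal Q
        # Weight P(x) / Q(x) compensates for sampling from Q instead of P.
        weight = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.5, 1.5)
        total += weight * g(x)
    return total / num_samples  # unnormalized weighted average


# E_P[X^2] = 1 for a standard Gaussian, so the estimate should be close to 1.
second_moment = importance_estimate(lambda x: x * x, 20000)
```

A proposal that placed zero density where the target has mass would bias the estimate, which is why full support of the proposal matters.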



12 We assume independent noise for each variable $x$, $y$, $z$, $\theta$, $\dot{\theta}$, $\dot{z}$, $f$ with standard deviations of
5, 5, 2, 0.01, 0, 1, 5, respectively. These variables are described in Table 1.

13 We assume independent noise for each variable $r_h$, $\dot{r}_h$, $\dot{h}$, $h$, $h_i$ with standard deviations of
5, 2, 2, 2, 2, respectively. These variables are described in Table 2.
We continue at step 3. For each candidate action $a'_i(j)$, we estimate its expected
utility by sampling the outcome $H$ from $P^K(H \mid x'_Y(\cdot), a'_i(j), w_i, t_i)$, and computing
the estimate $\hat{u}^K_i$. The weighting factor compensates for our use of a proposal distribution
to sample variables rather than sampling them from their natural distributions.
Finally, in step 4, we choose the move $a'_i(j^*)$ that has the highest expected utility
estimate.
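
Steps 3 and 4 amount to a weighted average followed by an argmax, which can be sketched as below. This is a schematic under stated assumptions: the function name and signature are hypothetical, the candidate moves and weighted environment samples are taken as already produced by steps 1 and 2, and the toy outcome and utility functions stand in for the kinematic simulation and the pilot utility.

```python
from typing import Callable, List, Sequence, Tuple


def choose_move(candidate_moves: Sequence[float],
                weighted_envs: List[Tuple[float, float]],
                sample_outcome: Callable[[float, float], float],
                utility: Callable[[float, float], float]) -> Tuple[float, float]:
    """Return the candidate move with the highest weighted utility estimate."""
    m_prime = len(weighted_envs)
    best_move, best_value = None, float("-inf")
    for move in candidate_moves:                      # step 3: one estimate per move
        total = 0.0
        for weight, env in weighted_envs:
            outcome = sample_outcome(env, move)       # sample H given env and move
            total += weight * utility(outcome, move)  # likelihood-weighted utility
        value = total / m_prime                       # unnormalized average over M'
        if value > best_value:
            best_move, best_value = move, value
    return best_move, best_value                      # step 4: argmax over moves


# Toy stand-ins: the environment is a baseline separation, maneuvering adds to it,
# and the utility trades separation against the size of the maneuver.
moves = [-10.0, 0.0, 10.0]
envs = [(1.0, 100.0), (0.0, 5.0), (0.5, 50.0)]  # (weight, environment) pairs
best_move, best_value = choose_move(
    moves,
    envs,
    sample_outcome=lambda env, move: env + abs(move),
    utility=lambda outcome, move: outcome - 2.0 * abs(move),
)
```

Note that the environment sample with zero weight contributes nothing to any estimate, mirroring the filtering effect of the deterministic TCAS logic.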



3.7 Encounter Simulation 

Up until now, we have presented a game which describes a particular instant t in 
time. In order to simulate an encounter to any degree of realism, we must consider 
how this game evolves with time. 

3.7.1 Time Extension of the Bayes Net 

Note that the timing of decisions (see footnote 14) is in reality stochastic as well as asynchronous.
However, to consider a stochastic and asynchronous timing model would greatly
increase the model's complexity. For example, the pilot would need to average over
the other pilots' decision timing and vice versa. As a first step, we choose a much
simpler timing model and make several simplifying assumptions. First, each pilot
only gets to choose a single move, and he does so when he receives his initial RA.
This move is maintained for the remainder of the encounter. Secondly, each pilot
decides his move by playing a simultaneous move game with the other pilots (the
game described by $(G, X, u, R, \pi)$). These assumptions effectively remove the timing
stochasticity from the model.

The choice of modeling as a simultaneous move game is an approximation, as
it precludes the possibility of the player anticipating the timing of other players' moves.
Formally speaking, this would introduce an extra dimension in the level-K thinking,
where the player would need to sample not only the other players' moves, but also the timing
of such moves for all time in the past and future. However, since
the players are not able to observe each other's moves directly (due to delays in
pilot and aircraft response), it does not make a difference to a player whether a move was
made in the past or simultaneously. This makes it reasonable to model the game
as simultaneous move at the time of decision. The subtlety here is that the player's
thinking should account for when his opponent made his move by imagining what
his opponent would have seen at the time of decision. However, in this case, since
our time horizons are short, this is a reasonable approximation.

Figure 5 shows a revised Bayes net diagram, this time showing the extension
to multiple time steps. Quantities indicated by hatching in the figure are passed
between time steps. There are two types of variables to be passed. First, we have the
aircraft states. Second, recall that TCAS intents are broadcast to other aircraft as
a coordination mechanism. These intents must also be passed on to influence future
RAs.

14 We are referring to the time at which the player makes his decision, not the amount of time it
takes for him to decide. Recall that level-K reasoning occurs only in the mind of the decision-maker
and thus does not require any wall clock time.



[Figure 5: time-extended Bayes net diagram; recoverable node labels include TCAS 1's RA $T_1$,
Outcome $H$, TCAS 2's RA $T_2$, and Social Welfare $F$.]

Fig. 5 Time-extended Bayes net diagram of a 2-aircraft mid-air encounter: We use a simple timing
model that allows each pilot to make a single move at the time he receives his TCAS alert.



3.7.2 Simulation Flow Control 

Using the time-extended Bayes net as the basis for an inner loop, we add flow control
to manage the simulation. Figure 6 shows a flow diagram for the simulation of a
single encounter. An encounter begins by randomly initializing a world state from
the encounter generator (to be discussed in Section 3.7.3). From here, the main loop
begins.

First, the observational ($W_i$ and $W_{TCAS_i}$) and TCAS ($T_i$) nodes are sampled. If a
new RA is issued, the pilot receiving the new RA is allowed to choose a new move
via a modified level-K d-relaxed strategy (described in Section 3.6). Otherwise, the
pilots maintain their current path. Note that in our model, a pilot may only make a
move when he receives a new RA. Since TCAS strengthenings and reversals (i.e.,
updates or revisions to the initial RA) are not modeled, this implies that each pilot
makes a maximum of one move per encounter. Given the world state and pilot commands,
the aircraft states are simulated forward one time step, and social welfare
(to be discussed in Section 3.8) is calculated. If a near mid-air collision (NMAC)
is detected (defined as having two aircraft separated by less than 100 ft vertically and
500 ft horizontally), then the encounter ends in collision and a social welfare value
of zero is assigned. If an NMAC did not occur, successful resolution conditions (all
aircraft have successfully passed each other) are checked. On successful resolution,
the encounter ends without collision and the minimum approach distance $d_{min}$ is
returned. If neither of the end conditions is met, the encounter continues at the top
of the loop by sampling observational and TCAS nodes at the following time step.
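
The NMAC end condition can be written as a small predicate. This is a minimal sketch using the 100 ft vertical and 500 ft horizontal thresholds stated above; the function name and the (x, y, z) tuple layout in feet are assumptions made for illustration.

```python
import math
from typing import Tuple

Position = Tuple[float, float, float]  # (x, y, z) in ft


def is_nmac(a: Position, b: Position,
            horizontal_limit_ft: float = 500.0,
            vertical_limit_ft: float = 100.0) -> bool:
    """True when two aircraft are within both the horizontal and vertical limits."""
    horizontal = math.hypot(a[0] - b[0], a[1] - b[1])
    vertical = abs(a[2] - b[2])
    return horizontal < horizontal_limit_ft and vertical < vertical_limit_ft


# 300 ft apart horizontally and 50 ft apart vertically: inside both limits.
collision = is_nmac((0.0, 0.0, 10000.0), (300.0, 0.0, 10050.0))
# Same horizontal geometry, but 200 ft of vertical separation.
safe = is_nmac((0.0, 0.0, 10000.0), (300.0, 0.0, 10200.0))
```

Both conditions must hold at the same time step, so ample separation in either dimension is enough to avoid an NMAC.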



3.7.3 Encounter Generation 

The purpose of the encounter generator is to randomly initialize the world states
in a manner that is genuinely representative of reality. For example, the encounters
generated should have realistic encounter geometries and configurations. One way
to approach this would be to use real data and, moreover, to devise a method to generate
fictitious encounters based on learning from real ones, such as that described in
[14][15]. For now, the random geometric initialization described in [17], Section 6.1,
is used (see footnote 15).



3.7.4 Aircraft Kinematics Model 

Since aircraft kinematic simulation is performed at the innermost step, its implementation
has a large impact on the overall system performance. To address computational
considerations, a simplified aircraft kinematics model is used in place of full
aircraft dynamics. We justify these first-order kinematics in 2 ways: First, we note
that high-frequency modes are unlikely to have a high impact at the time scales
(approximately 1 min) that we are dealing with in this modeling. Secondly, modern flight control
systems operating on most (especially commercial) aircraft provide a fair amount of
damping of high-frequency modes as well as a high degree of abstraction.
We make the following assumptions in our model:

15 The one modification is that $t_{target}$ (the initial time to collision between aircraft) is generated
randomly from a uniform distribution between 40 s and 60 s rather than fixed at 40 s.
Fig. 6 Flow diagram of the encounter simulation process: We initialize the encounter using an
encounter generator, and simulate forward in time using pilot commands and aircraft kinematics. 
The encounter ends when the aircraft have either collided or have successfully passed each other. 



1. Only kinematics are modeled. Aerodynamics are not modeled. The assumption 
is that modern flight control systems abstract this from the pilot. 

2. No wind. Wind is not considered in this model. 

3. No sideslip. This model assumes that the aircraft velocity vector is always 
fully aligned with its heading. 

4. Pitch angle is abstracted. Pitch angle is ignored. Instead, the pilot directly 
controls vertical rate. 

5. Roll angle is abstracted. Roll angle is ignored. Instead, the pilot directly 
controls heading rate. 

Figure 7 shows the functional diagram of the kinematics model. The input 
commands are first applied as inputs to first-order linear filters to update 
$\dot{\theta}$, $\dot{z}$, and $f$. These quantities are then used in the kinematic 
calculations to update the position and heading of the aircraft at the next time 
step. Intuitively, the filters provide the appropriate time response (transient) 
characteristics for the system, while the kinematic calculations approximate the 
effects of the input commands on the aircraft's position and heading. 



[Figure 7: block diagram. Recoverable labels: Initial Heading Rate, Filtered 
Heading Rate, Vertical Speed, Kinematic update, Heading.] 

Fig. 7 Aircraft kinematics model functional diagram: Pilot commands are passed to filters to model 
aircraft transient response to first order. Then aircraft kinematic equations based on forward Euler 
integration are applied. 



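The kinematic update equations, based on the forward Euler integration method, are 
given by: 

\[
\begin{aligned}
\theta_{t+\Delta t} &= \theta_t + \Delta t \cdot \dot{\theta}_t \\
x_{t+\Delta t} &= x_t + \Delta t \cdot f_t \cos\theta_t \\
y_{t+\Delta t} &= y_t + \Delta t \cdot f_t \sin\theta_t \\
z_{t+\Delta t} &= z_t + \Delta t \cdot \dot{z}_t
\end{aligned}
\]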

Recall that a first-order filter requires two parameters: an initial value and a time 
constant. We set the filter's initial value to the pilot commands at the start of the 
encounter, thus starting the filter at steady state. The filter time constants are chosen 
by hand (using the best judgment of the designers) to approximate the behavior of 
mid-size commercial jets. Refinement of these parameters is the subject of future 
work. 
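
As a rough illustration of the filter-then-integrate structure described above, the following Python sketch performs one simulation step. It is only a sketch under stated assumptions: the names `AircraftState`, `first_order_filter`, and `step` are hypothetical, the filter is discretized with a forward Euler step (the text does not say how the filters are discretized), the filtered rates are used immediately in the kinematic update, and the default time constants are placeholders rather than the hand-tuned values used by the authors.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AircraftState:
    x: float          # horizontal position, first axis
    y: float          # horizontal position, second axis
    z: float          # altitude
    theta: float      # heading [rad]
    theta_dot: float  # filtered heading rate [rad/s]
    z_dot: float      # filtered vertical speed
    f: float          # filtered forward speed


def first_order_filter(value, command, tau, dt):
    """One forward-Euler step of a first-order lag with time constant tau.

    Assumption: the discretization is ours; the source only states that
    first-order linear filters are used.
    """
    return value + (dt / tau) * (command - value)


def step(state, cmd_theta_dot, cmd_z_dot, cmd_f, dt=1.0,
         tau_theta_dot=2.0, tau_z_dot=3.0, tau_f=5.0):
    """Filter the pilot commands, then apply the forward Euler update.

    The time constants are placeholder values, not calibrated ones.
    """
    theta_dot = first_order_filter(state.theta_dot, cmd_theta_dot, tau_theta_dot, dt)
    z_dot = first_order_filter(state.z_dot, cmd_z_dot, tau_z_dot, dt)
    f = first_order_filter(state.f, cmd_f, tau_f, dt)
    return AircraftState(
        x=state.x + dt * f * math.cos(state.theta),
        y=state.y + dt * f * math.sin(state.theta),
        z=state.z + dt * z_dot,
        theta=state.theta + dt * theta_dot,
        theta_dot=theta_dot,
        z_dot=z_dot,
        f=f,
    )
```

Initializing `theta_dot`, `z_dot`, and `f` to the commands in effect at the start of the encounter reproduces the steady-state start described above: the filters then leave those quantities unchanged until a new command arrives.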



3.7.5 Modeling Details Regarding the Pilot's Move A, 

Recall that a pilot only gets to make a move when he receives a new RA. In fact, 
since strengthenings and reversals are not modeled, each pilot begins the 
scenario with a vertical speed and gets at most one chance to change it. At their 
decision point, the pilots engage in a simultaneous-move game (described in 
Section 3.6) to choose an aircraft escape maneuver. To model pilot reaction time, 
a 5-second delay is inserted between the time the player chooses his move and the 
time the aircraft maneuver is actually performed. 



3.8 Social Welfare F 

The social welfare function is specified a priori and maps an instantiation of 
the Bayes net variables to a real number F. Just as a player's degree of happiness 
is summarized by his utility, social welfare quantifies the degree of happiness 
of the system as a whole. Consequently, it is the system-level metric that the 
system designer or operator seeks to maximize. As there are no restrictions on how 
to set the social welfare function, it is up to the system designer to decide the 
system objectives. In practice, regulatory bodies, such as the Federal Aviation 
Administration or the International Civil Aviation Organization, will likely be 
interested in defining the social welfare function in terms of a balance of safety 
and performance metrics. For now, social welfare is chosen to be the minimum 
approach distance $d_{min}$. In other words, the system is interested in aircraft 
separation. 
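
For concreteness, a minimum approach distance can be computed from two simulated trajectories as sketched below. This is a minimal sketch, not the authors' implementation: the function name `min_approach_distance` is hypothetical, and we assume that both trajectories are sampled at the same time steps and that distance means three-dimensional Euclidean distance.

```python
import math


def min_approach_distance(traj1, traj2):
    """Smallest distance between two aircraft over a simulated encounter.

    Each trajectory is a sequence of (x, y, z) positions sampled at the
    same time steps (an assumption of this sketch).
    """
    return min(math.dist(p, q) for p, q in zip(traj1, traj2))


# Usage with two short illustrative trajectories.
own = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
intruder = [(4.0, 0.0, 0.0), (3.0, 0.0, 0.0), (2.0, 3.0, 4.0)]
d_min = min_approach_distance(own, intruder)
```

Because positions are compared only at the sampled time steps, the value can overestimate the true closest approach when the time step is coarse.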



3.9 Example Encounter 

To look at average behavior, one would execute a large number of encounters to 
collect statistics on F. To gain a deeper understanding of encounters, however, we 
examine encounters individually. Figure 8 shows 10 samples of the outcome 
distribution for an example encounter. Only a single outcome occurs in reality, 
but the trajectory spreads provide an insightful visualization of the distribution of 
outcomes. In this example, we can see (by visual inspection) that a mid-air collision 
is unlikely to occur in this encounter. Furthermore, we see that probabilistic 
predictions by semi net-form game modeling provide a much more informative picture 
than the deterministic predicted trajectory that the TCAS model assumes (shown by 
the thicker trajectory). 




Fig. 8 Predicted trajectories sampled from the outcome distribution of an example encounter: Each 
aircraft proceeds on a straight-line trajectory until the pilot receives an RA. At that point, the pilot 
uses level-K d-relaxed strategies to decide what vertical rate to execute. The resultant trajectories 
from 10 samples of the vertical rate are shown. The trajectory assumed by TCAS is shown as the 
thicker trajectory. 



3.10 Sensitivity Analysis 

Because they are based on sampling, the level-K relaxed strategy and its variants 
are all well suited for use with Monte Carlo techniques. In particular, such 
techniques can be used to assess the performance of the overall system by measuring 
social welfare F (as defined in Section 3.8). Observing how F changes while varying 
parameters of the system can provide valuable insights about the system. To 
demonstrate this capability, parameter studies were performed on the mid-air 
encounter model, and sample results are shown in Figures 9-12. In each case, we 
observe expected social welfare while selected parameters are varied. Each data 
point represents the average of 1800 encounters. 
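
The structure of such a parameter study can be sketched as follows. This is an illustration only: `simulate_encounter` is a hypothetical toy stand-in for a full encounter simulation (it is not the model of this chapter), and the names `expected_welfare` and `sweep` are likewise our own. The sketch shows only the Monte Carlo averaging, with a standard error attached to each data point.

```python
import random
import statistics


def simulate_encounter(noise_multiple, rng):
    """Hypothetical stand-in for one full encounter simulation; returns F.

    Toy assumption: separation shrinks and scatters as noise grows.
    """
    mean = 1000.0 - 100.0 * noise_multiple
    return max(0.0, rng.gauss(mean, 50.0 * noise_multiple))


def expected_welfare(noise_multiple, n_encounters=1800, seed=0):
    """Monte Carlo estimate of expected social welfare and its standard error."""
    rng = random.Random(seed)
    samples = [simulate_encounter(noise_multiple, rng) for _ in range(n_encounters)]
    mean = statistics.fmean(samples)
    stderr = statistics.stdev(samples) / n_encounters ** 0.5
    return mean, stderr


def sweep(values, n_encounters=1800, seed=0):
    """Estimate expected social welfare at each parameter value."""
    return {v: expected_welfare(v, n_encounters, seed) for v in values}


# Usage: one data point per parameter value.
results = sweep([1.0, 2.0, 4.0])
```

Reporting the standard error alongside each average gives a rough sense of whether differences between neighboring data points exceed the sampling noise.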

In Figure 9, the parameters $M_W$ and $M_{W_{TCAS}}$, which are multiples on the noise 
standard deviations of $W$ and $W_{TCAS}$ respectively, are plotted versus social welfare 
F. It can be seen that as the pilot's and the TCAS system's observations get noisier 
(e.g., due to fog or faulty sensors), social welfare decreases. This agrees with our 
intuition. A noteworthy observation is that social welfare decreases faster with 
$M_W$ (i.e., when the pilot has a poor visual observation of his environment) than 
with $M_{W_{TCAS}}$ (i.e., noisy TCAS sensors). This would be especially relevant to, 
for example, a funder who is allocating resources between developing less noisy 
TCAS sensors and more advanced panel displays for better situational awareness. 



[Figure 9: plot. Recoverable axis labels: Pilot's Observation Noise $M_W$, TCAS 
Sensor Noise $M_{W_{TCAS}}$.] 

Fig. 9 Impacts of observational noise on social welfare: Social welfare is plotted against multiples 
on the noise standard deviations of $W$ and $W_{TCAS}$. We observe that social welfare decreases much 
faster with increase in $M_W$ than with increase in $M_{W_{TCAS}}$. This means that according to our model, 
pilots receive more information from their general observations of the world state than from the 
TCAS RA. 



Figure 10 shows the dependence of social welfare on selected TCAS internal 
logic parameters DMOD and ZTHR HI 611 . These parameters are primarily used to 
define the size of safety buffers around the aircraft, and thus it makes intuitive sense 
to observe an increase in F (in the manner that we've defined it) as these parameters 
are increased. Semi net-form game modeling gives full quantitative predictions in 
terms of a social welfare metric. 

Figure 11 plots player utility weights versus social welfare. In general, the results 
agree with the intuition that higher $\alpha_1$ (stronger desire to avoid collision) and 
lower $\alpha_2$ (weaker desire to stay on course) lead to higher social welfare. These 
results may be useful in quantifying the potential benefits of training programs, 
regulations, incentives, and other pilot behavior-shaping efforts. 



Fig. 10 Impacts of TCAS parameters DMOD and ZTHR on social welfare: We observe that social 
welfare increases as DMOD and ZTHR are increased. This agrees with our intuition since these 
parameters are used to define the size of safety buffers around the aircraft. 




Fig. 11 Impacts of player utility weights (see Section 3.3) on social welfare: We observe that 
higher $\alpha_1$ (more weight to avoiding collision) and lower $\alpha_2$ (less weight to maintaining current 
course) lead to higher social welfare. 






Figure 12 plots model parameters M and M' versus F. Recall from our discussion 
in Section 2.6 that these parameters can be interpreted as a measure of the pilot's 
rationality. As such, we point out that these parameters are not ones that can be 
controlled, but rather ones that should be set as closely as possible to reflect reality. 
One way to estimate the "true" M and M' would be to learn them from real data. 
(Learning model parameters is the subject of a parallel research project.) A plot like 
Figure 12 allows a quick assessment of the sensitivity of F to M and M'. 




Fig. 12 Impacts of pilot model parameters M and M' (see Definition 7) on social welfare: We 
observe that as these parameters are increased, there is an increase in social welfare. This agrees 
with our interpretation of M and M' as measures of rationality. 



3.11 Potential Benefits of a Horizontal RA System 

Recall that due to high noise in TCAS' horizontal sensors, the current TCAS system 
issues only vertical RAs. In this section, we consider the potential benefits of a 
horizontal RA system. The goal of this work is not to propose a horizontal TCAS 
system design, but to demonstrate how semi net-form games can be used to evaluate 
new technologies. 

In order to accomplish this, we make a few assumptions. Without loss of 
generality, we refer to the first aircraft to issue an RA as aircraft 1, and the 
second aircraft to issue an RA as aircraft 2. First, we notice that the variable 
$W_{TCAS_i}$ does not contain relative heading information, which is crucial to 
properly discriminating between various horizontal geometric configurations. In 
[18], Kochenderfer et al. demonstrated that it is possible to track existing 
variables (range, range rate, bearing to intruder, etc.) over time using an 
unscented Kalman filter to estimate the relative heading and velocity of two 
aircraft. Furthermore, estimates of the steady-state tracking variances for these 
horizontal variables were provided. For simplicity, this work does not attempt to 
reproduce these results, but instead simply assumes that these variables exist and 
are available to the system. 

Secondly, until now the pilots have been restricted to making only vertical 
maneuvers. This restriction is now removed, allowing pilots to choose moves that 
have both horizontal and vertical components. However, we continue to assume en 
route aircraft, and thus aircraft heading rates are initialized to zero at the 
start of the encounter. Finally, we assume that the horizontal RA system is an 
augmentation to the existing TCAS system rather than a replacement. As a result, 
we first choose the vertical component using mini TCAS as was done previously, 
then select the horizontal RA component using a separate process. 

As a first step, we consider a reduced problem where we optimize the horizontal 
RA for aircraft 2 only; aircraft 1 is always issued a 'maintain heading' horizontal 
RA. (Considering the full problem would require backward induction, which we do 
not tackle at this time.) For the game theoretic reasoning to be consistent, we 
assume that the RA issuing order is known not only to the TCAS systems, but also 
to the pilots. Presumably, the pilots would receive this order information via 
their instrument displays. To optimize the horizontal RA component for aircraft 2, 
we perform an exhaustive search over the five candidate horizontal RAs (hard 
left, moderate left, maintain heading, moderate right, and hard right), determining 
the expected social welfare of each. The horizontal RA with the highest expected 
social welfare is selected and issued to the pilot. To compute expected social 
welfare, we simulate a number of counterfactual scenarios of the remainder of the 
encounter, and then average over them. 
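
The exhaustive search described above can be sketched in a few lines of Python. This is a sketch under stated assumptions: `select_horizontal_ra` is a hypothetical name, `simulate_remainder` stands for whatever routine simulates one counterfactual continuation of the encounter and returns its social welfare, and `toy_remainder` is an invented toy stand-in used only to make the example runnable.

```python
import random
import statistics

HORIZONTAL_RAS = [
    "hard left", "moderate left", "maintain heading",
    "moderate right", "hard right",
]


def select_horizontal_ra(simulate_remainder, n_samples=200, seed=0):
    """Pick the candidate RA with the highest estimated expected welfare.

    simulate_remainder(ra, rng) must return the social welfare of one
    sampled counterfactual continuation of the encounter under that RA.
    """
    rng = random.Random(seed)
    expected = {}
    for ra in HORIZONTAL_RAS:
        samples = [simulate_remainder(ra, rng) for _ in range(n_samples)]
        expected[ra] = statistics.fmean(samples)
    best = max(expected, key=expected.get)
    return best, expected


def toy_remainder(ra, rng):
    """Hypothetical toy model: a fixed nominal welfare per RA plus noise."""
    nominal = {
        "hard left": 400.0, "moderate left": 550.0, "maintain heading": 600.0,
        "moderate right": 800.0, "hard right": 700.0,
    }
    return max(0.0, rng.gauss(nominal[ra], 50.0))


# Usage with the toy stand-in.
best_ra, welfare_by_ra = select_horizontal_ra(toy_remainder)
```

With only five candidates, exhaustive search costs five times the number of counterfactual samples per decision, which is why it is feasible here without a more elaborate optimizer.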

To evaluate its performance, we compare the method described above (using 
exhaustive search) to a system that issues a 'maintain heading' RA to both aircraft. 
Figure 13 shows the distribution of social welfare for each system. Not only does 
the exhaustive search method show a higher expected value of social welfare, it also 
displays an overall distribution shift, which is highly desirable. By considering the 
full shape of the distribution rather than just its expected value, we gain much more 
insight into the behavior of the underlying system. 



Fig. 13 A comparison of social welfare for two different horizontal RA systems: Not only does the 
expected value of social welfare increase by using the exhaustive search method, we also observe 
a shift upwards in the entire probability distribution. 



4 Advantages of Semi Net-Form Game Modeling 

There are many distinct benefits to using semi net-form game modeling. We 
elaborate on them below. 

1. Fully probabilistic. A semi net-form game is a thoroughly probabilistic model, 
and thus represents all quantities in the system using random variables. As a 
result, not only are the probability distributions available for states of the Bayes 
net, they are also available for any metrics derived from those states. For the 
system designer, the probabilities offer an additional dimension of insight for 
design. For regulatory bodies, the notion of considering full probability 
distributions to set regulations represents a paradigm shift from the current mode 
of aviation operation. 

2. Modularity. A major strength of using a Bayes net as the basis of modeling is 
that it decomposes a large joint probability into smaller ones using conditional 
independence. In particular, these smaller pieces have well-defined inputs and 
outputs, making them easily upgraded or replaced without affecting the entire 
net. One can imagine an ongoing modeling process that starts with very crude 
models and then progressively refines each component into a higher-fidelity one. 
The interaction between components is facilitated by using the common language 
of probability. 

3. Computational human behavior model. Human-In-The-Loop (HITL) experiments 
(those that involve human pilots in mid- to high-fidelity simulation 
environments) are very tedious and expensive to perform because they involve 
carefully crafted test scenarios and human participants. As a result, HITL 
experiments produce very few data points relative to the number needed for 
statistical significance. On the other hand, semi net-form games rely on 
mathematical modeling and numerical computations, and thus produce data at much 
lower cost. 

4. Computational convenience. Because semi net-form game algorithms are based 
on sampling, they enjoy many inherent advantages. First, Monte Carlo algorithms 
are easily parallelized across multiple processors, making them highly scalable. 
Second, we can improve the performance of our algorithms by using more 
sophisticated Monte Carlo techniques. 



5 Conclusions and Future Work 

In this chapter, we defined a framework called "Semi Network-Form Games," and 
showed how to apply that framework to predict pilot behavior in NMACs. As we 
have seen, such a predictive model of human behavior enables not only powerful 
analyses but also design optimization. Moreover, the framework has many desirable 
features, which include modularity, fully probabilistic modeling capabilities, and 
computational convenience. 

The authors caution that, since this study was performed using simplified models 
and uncalibrated parameters, further studies should be pursued to verify these 
findings. The primary focus of this work is to demonstrate the modeling 
technology, and thus a follow-on study is recommended to refine the model using 
experimental data. 

In future work, we plan to further develop the ideas in semi network-form games 
in the following ways. First, we will explore the use of alternative numerical 
techniques for calculating the conditional distribution describing a player's 
strategy $P(X_v \mid x_{pa(v)})$, such as using variational calculations and belief 
propagation [19]. Secondly, we wish to investigate how to learn semi net-form game 
model parameters from real data. Lastly, we will develop a software library to 
facilitate semi net-form game modeling, analysis and design. The goal is to create 
a comprehensive tool that not only enables easy representation of any hybrid 
system using a semi net-form game, but also houses ready-to-use algorithms for 
performing learning, analysis and design on those representations. We hope that 
such a tool would be useful in augmenting the current verification and validation 
process of hybrid systems in aviation. 

By building powerful models such as semi net-form games, we hope to augment 
the current qualitative methods (i.e., expert opinion, expensive HITL experiments) 
with computational human models to improve safety and performance for all hybrid 
systems. 